The Journal of 
Experimental Education 


A periodical report of scientific investigations relating to child development, 
curriculum, 


June, 1953 


Page 


An Investigation of the Relationship Between Teaching Effectiveness and 
the Teacher's Attitude of Acceptance Harold J. Reed 277 


Measuring Knowledge and Application: An Experimental Investigation 
Donald E. Smith and Marvin D. Glock 327 


An Inverted Factor Analysis Study of Student-Rated Introductory Psychol- 
ogy Instructors A. W. Bendig 333 


Judgments by 820 College Executives of Traits Desirable in Lower- 
Division College Teachers M. R. Trabue 337 


PUBLISHED QUARTERLY 


Published by Dembar Publications, Inc., 
Madison 3, Wisconsin. 
Entered as second-class matter October 17, 1938 at the post office at Madison, 
Wisconsin, under the act of March 3, 1879, 









Journal of Experimental Education 


Volume XXI 


June, 1953 Number 4 


AN INVESTIGATION OF THE RELATIONSHIP 
BETWEEN TEACHING EFFECTIVENESS AND 
THE TEACHER’S ATTITUDE 
OF ACCEPTANCE 


HAROLD J. REED* 
Long Beach, California 


I. Purpose 


THE PURPOSE of this study is to investi- 
gate the relationship between the teacher’s atti- 
tude of acceptance and his teaching effectiveness. 
This investigation will examine the hypothesis 
that the teacher who is the more accepting of him- 
self and his environment is the more effective 
teacher. 


II. Problems 


The broad scope of this study can be defined 
in terms of the following problems: 

1. Is there a determinable relationship between 
the teacher’s effectiveness in the classroom 
and that aspect of a teacher’s personality organi- 
zation, or attitude, which permits him to be an 
accepting person? 

2. Can the predictive instrument employed in 
the present investigation be made to provide mean- 
ingful information concerning the teacher’s atti- 
tude of acceptance? 

3. Can the criterion measures employed in 
the present investigation be made to provide re- 
liable information concerning the students’ eval- 
uations of their teachers’ effectiveness and the 
students’ feelings concerning their teachers’ atti- 
tudes toward them? 

4. Can the criterion measures employed in 
the present investigation be made to show mean- 
ingful relationships between the different criter- 
ion groups? 

5. Can it be shown that self-evaluations of teach- 
ing effectiveness are reliable criterion measures? 

6. Are there any meaningful relationships be- 
tween certain biographical data and the teachers’ 
attitude of acceptance? 





III. Limitations of Study 





The scope of this study can be defined further 
in terms of its limitations. 

1. This investigator has chosen to approach 
the complex problem of teacher effectiveness in 
terms of the personality dynamics of the teacher. 
All other factors are thereby excluded. 

2. It is not the purpose of this investigator to 
determine what type of personality is most effec- 
tive as a classroom teacher, nor to determine 
the structure of the optimum personality. Rather 
is it his purpose to analyze one aspect, or dimen- 
sion, of the optimum personality organization 
which has been found to be significant in other as- 
pects of the study of human nature, and to deter- 
mine its relationship to effective teacher behav- 
ior. 

3. This study will not attempt to compare sec- 
ondary teachers on this dimension of acceptance 
with other teachers nor will it try to compare 
teachers with other occupational groups. 

4. The criterion measures used in this inves- 
tigation are designed primarily to sample the 
feelings of the students toward their teachers. 
It is not the purpose of this investigator to define 
the effective teacher nor to establish valid criter- 
ion measures of teacher effectiveness, beyond 
what is involved in the criterion measures used 
in this investigation. 


IV. Need for Study 





It may seem presumptuous of this investigator 
to think that something more can be added to the 
many studies which have been made in the area 
of teaching efficiency. He may be equally bold 
in his latent criticism of the assumptions and 





*394 Orlena Avenue 







methodology of other experimenters. On the other 
hand, the simple fact that as yet there is no evi- 
dence of unanimity of opinion regarding the nature 
of the effective teacher—and therefore no accept- 
able technique for selecting the good teacher — 
would seem to justify the assumption that a new 
hypothesis or a new predictive measure might 
add some knowledge or clarify some disputed 
factor. 

In referring to correlation studies between in- 
telligence and effective teaching, Super concluded, 
‘‘Apparently the occupation ‘teacher’ is too broad 
a category for psychological study.’’1 He also 
indicated that some occupational groups ‘‘were 
not distinguishable from men-in-general’’2 in 
their interest patterns. Teaching was consider- 
ed by Super to be one of those groups. This at- 
titude is shared by many investigators, but others 
believe that the search for more refined meas- 
ures and descriptions of the effective teacher 
should continue. 

After reviewing 150 studies in the measurement 
and prediction of teaching efficiency, Barr3 
observed that the predictive devices used would 
indicate that improvements could be made. He 
called attention to the fact that in most studies 
the reliability was high but the validity was unknown. 
The low correlations of validity may have 
been due, according to Barr, to a weakness in 
the criterion or predictive measures. He also 
felt that measures had been confused with evalu- 
ations, and data had been consistent only when 
repeated under comparable conditions. 
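
The distinction Barr draws between reliability and validity can be illustrated with a short computation. The sketch below is only an illustration under stated assumptions: the scores are hypothetical numbers invented for the example (they do not come from Barr or from the present study), and the Pearson product-moment coefficient is assumed as the index of both reliability (agreement between two administrations of the same predictive device) and validity (agreement between the device and a criterion).

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six teachers (illustrative only, not study data).
test_first = [10, 12, 15, 18, 20, 25]   # predictive device, first administration
test_second = [11, 12, 16, 17, 21, 24]  # same device, second administration
criterion = [3, 5, 2, 4, 3, 4]          # hypothetical effectiveness ratings

reliability = pearson(test_first, test_second)  # test-retest agreement
validity = pearson(test_first, criterion)       # agreement with the criterion

print(f"reliability = {reliability:.2f}, validity = {validity:.2f}")
```

With these invented numbers the first coefficient is close to 1 and the second is close to 0, which is the pattern described above: a device may measure consistently without measuring what the criterion measures.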

The literature is replete with supplications 
for better integrated teachers. Baxter has stated 
that ‘‘the classroom must be considered a social 
laboratory in which children learn to live with 
others cooperatively.’’4 Snygg and Combs5 have 
called attention to the role played by the teacher 
in assisting students to discover realistic and 
effective solutions to their present problems. The 
complex living conditions of today seem to be 
charging the American schools and the teachers 
within those schools with continually broadening 
responsibilities. These increased responsibil- 
ities of the school seem to demand teachers of 
peculiar powers and abilities. 







The importance of the teacher’s role in the 
educational process has never been questioned. 

If an administrator were to employ only those 
candidates who possessed all of the traits of the 
desired teacher, his school would be very understaffed. 
At the same time the administrator is 
in great need of predictive cues which will enable 
him to select candidates who most closely approach 
the ideal. Any knowledge which can assist him 
will improve the school’s contribution to the stud- 
ent and society. 

The selective process is a continuous one, ac- 
cording to Ryans, ‘‘beginning as early as possible 
in student life and continuing through teacher train- 
ing and on into the employment period.’’6 At each 
succeeding level the refining process should be 
more discriminating. 

Not only is there a continuous process of selec- 
tion, but there is a need for a continuous or con- 
sistent criterion of judgment. Once the policy is 
accepted, school counselors, training institutions, 
and administrators are better able to perform 
their functions. 

Growth is also a continuous process. As our 
fund of knowledge of the teaching process in- 
creases through experience and experimental ev- 
idence, the findings should be shared with active 
teachers. 

Any evidence of the importance of the teacher’s 
attitude toward himself, toward others, and to- 
ward situations, as it relates to his teaching ef- 
ficiency, should be investigated. Teachers often 
have deep-rooted habits that get in the way of 
achieving their goals. ‘‘For example, the habit 
of judging pupil behavior in terms of its effects 
on the accomplishment of teacher’s own purposes 
for the child, or for the group, interferes with 
understanding the child; so does the habit of judg- 
ing pupil behavior on the basis of the teacher’s 
personal prejudices and cultural values.’’7 

The great success of the volunteer study groups 
sponsored by the American Council on Education 
attests to the effect of in-service training pro- 
grams for increasing teacher effectiveness. It is 
hoped that the results of this study may, as others 
have done, provide some insight to teachers 
through study groups. 





1. D. A. Super, Appraising Vocational Fitness (New York: Harper and Brothers, 1949), p. 101. 





2. Ibid., p. 383. 


3. A. S. Barr, "The Measurement and Prediction of Teaching Efficiency," Journal of Experimental 
Education, (June 1948), p. 204. 





4. B. Baxter, Teacher-Pupil Relationships (New York: Macmillan Co., 192), p. 2. 





5. D. Snygg and A. W. Combs, Individual Behavior (New York: Harper and Brothers, 1949), p. 22. 





6. D. G. Ryans, "Appraising Teacher Personnel," Journal of Experimental Education, XVI (September 1947), p. 1. 


7. Division on Child Development and Teacher Personnel, Daniel A. Prescott, Helping Teachers Un- 










A theoretical framework has been necessary 
for educational research. Educators are respon- 
sible for evaluating the effects of various educa- 
tional procedures. Philosophy, and more partic- 
ularly the philosophy of education, has attempted 
to make our experience intelligible through crit- 
ical thinking.8 Education has also looked to psy- 
chology for a theoretical frame of reference as 
well as for facts. However, psychologists, ac- 
cording to Snygg and Combs, have not developed 
‘‘a frame of reference which brings their unwieldy 
body of information into unity and consistency.’’9 
Until some purposeful and meaningful order is 
created from this atomistic approach, neither ed- 
ucation nor psychology can proceed toward effec- 
tive solutions to the problems of education. 

This investigator will attempt to keep in mind 
the needs of education in the area of teacher ef- 
fectiveness and at the same time utilize a predic- 
tive device of psychology with a frame of refer- 
ence which seems to be consistent with the phil- 
osophy of both education and psychology. 

Most studies in this area have standardized a 
predictive device against standardized criterion 
measures, or they have attempted to correlate 
standardized tests against a sample of active or 
apprentice teachers. Few attempts have been 
made to standardize teacher norms on these tests. 
That, it would seem, would be a meaningful con- 
tribution. In this study, the investigator will at- 
tempt to create his own criterion measures and 
establish his own norms on a predictive device 
of a type which has been found to be effective in 
related areas. 

This investigator believes that, in many stud- 
ies, implicit errors result from the use of cri- 
terion and predictive measures which are not ap- 
plicable to the purposes of the investigator, and 
that these errors have contributed to rather than 
resolved the confusion in the measurements of 
teacher efficiency. For purposes of this study, 
the investigator will sample the feelings and the 
judgment of students, administrators, and teach- 
ers on an unstructured scale, and thereby attempt 
to free himself of the influence of norms which 
may be inappropriate. The investigator will also 
attempt to sample a dimension of personality 
dynamics which has not previously been meas- 
ured, and to determine its relationship to teach- 
er effectiveness. It is hoped that a break with 







conventional design may produce some new un- 
derstanding of this all-important area in the ed- 
ucational field. 


SECTION II 
RATIONALE FOR STUDY 


THE RATIONALE for this study is to be 
found in the thinking of those psychologists and 
educators committed to the opinion that behavior 
is a function of a well-defined and consistent at- 
tempt on the part of the organism to maintain a 
unified and integrated personality organization. 
As has been previously stated, it is not the pur- 
pose of this investigator to add any new thinking 
to the theory of personality. Rather is it his 
purpose to submit some of the concepts of prev- 
ious researchers to further investigation for pur- 
poses of clarification and verification. More 
specifically, it is proposed to determine the ex- 
tent to which their hypotheses can be helpful in 
teacher selection and training. 

It has been assumed from a review of the lit- 
erature and experience in the fields of philosophy 
of education and clinical psychology that there 
is an optimum quality of personality that is more 
effective than another, and that this quality of 
personality is based upon the individual’s goal of 
maintaining some structure of values that is 
meaningful to him and acceptable to his society. 


The basic value concepts are the mean- 
ingful core about which the personality 
is organized. An integrated personal- 
ity will emerge if the core values are 
harmonious and valid, while mental con- 
flict will occur if the core concepts are 
inharmonious or incompatible.1 


These core values are not readily perceptible, 
and for identification of them one must rely upon 
their manifestations in behavior patterns, traits, 
attitudes, feelings, etc. One trait that has been 
examined carefully is acceptance. It has been 
assumed that the individual who is accepting is 
unthreatened; he feels secure; and if he is un- 
threatened he will have no need to be aggressively 
hostile or to defend himself. Dynamically, it 





derstand Children, prepared for Commission on Teacher Education (Washington, D. C.: American 
Council on Education, 1945), p. 21. 


8. J. T. Wahlquist, The Philosophy of American Education (New York: The Ronald Press, 1942), p. 5. 





9. Snygg and Combs, op. cit., p. 205. 


Section II 


1. D. A. Prescott, Emotion and the Educative Process (Washington, D.C.: American Council on Education, 1938), p. 207. 







can be said that the unthreatened, secure, or ac- 
cepting personality is one that is well integrated 
or harmoniously balanced. 

Kurt Goldstein’s2 and W. B. Cannon’s3 con- 
cepts of equilibrium and homeostasis describe 
the organism’s attempt to maintain itself in the 
face of physiological disturbances. The same 
phenomena can be noted in the organism’s affec- 
tive nature. ‘‘Behavior expresses the effort to 
maintain the integrity and unity of the organiza- 
tion.... The nucleus of the system, around which 
the rest of the system revolves, is the individual’s 
idea or conception of himself.’’4 

The drive within the individual to maintain a 
balance, or his integrity, has been variously de- 
fined, but always in terms of the individual’s 
needs and his perception of the world around 
him. Sherman and Sherman5 concluded from 
their study of emotional responses in infants that 
there were two opposite tendencies, ‘‘rejecting 
the stimulus and accepting the stimulus.’’ It 
would seem, therefore, that the self is the locus 
of behavior. What the self does as he reacts to 
his environment is determined by the way he per- 
ceives his world, whether it is acceptable or 
whether it is threatening. 

From our knowledge of the phenomena of per- 
ception, we can see that our perceptions do not 
come simply from the objects around us, but 
from our past experience. These perceptions 
seem to be screened through our past experience 
with those objects, the affective conditioning re- 
sulting from that experience, and the future goals 
we have in mind for ourselves. As Kelly points 
out, we take a large number of clues, none of 
which is reliable, add them together, and make 
what we can of them. All that this gives us is an 
estimate of our surroundings.7 The Hanover In- 
stitute demonstrations in perception showed that 
distortion of perception was significant, but they 
also showed the disturbing effect that distortion 







had on the one viewing it. Some became angry, 
some laughed, and some were embarrassed. When 
old habits fail to satisfy, the inconsistency pre- 
sents a problem for the individual. Snygg and 
Combs have said, 


Those individuals whose perceptions 
made possible the satisfaction of need 
are happy, effective and efficient people. 
On the other hand, those whose differen- 
tiations do not permit of adequate need 
satisfactions are likely to be ineffective, 
unhappy and generally thwarted person- 
alities.8 


The differentiation of the perceptual world is 
in terms of those things that are consistent with 
the individual’s idea or conception of himself, ac- 
cording to Lecky,9 or the individual’s self-inter- 
ests and value concepts, as stated by Prescott,10 
or in terms of the phenomenal self, as described 
by Snygg and Combs. The latter writers state 
that the basic human need is ‘‘the preservation 
and enhancement of the phenomenal self’’ and 
‘‘the phenomenal self includes all those parts of 
the phenomenal field which the individual exper- 
iences as part or characteristic of himself.’’11 
Raimy was one of the first to work in this area 
and he defines the self-structure as the self-con- 
cept which ‘‘is the more or less organized per- 
ceptual object resulting from present and past 
self-observation.’’12 

The dynamic interrelations between the situ- 
ations of our phenomenal field and the desire of 
the individual to maintain a state of balance in 
accordance with his concept of himself, his phen- 
omenal self, give rise to emotional behavior. 
‘‘Attitudes and value concepts define for us the 
areas of experience which will carry the possi- 
bilities of arousing emotional responses.’’13 For 
those who are well adjusted and emotionally ma- 





2. K. Goldstein, Human Nature in the Light of Psychopathology (Cambridge: Harvard University 
Press, 1940). 


3. W. B. Cannon, The Wisdom of the Body (New York: W. W. Norton & Co., Inc., 1939). 





4. P. Lecky, Self-Consistency (New York: Island Press, 1945), p. 150. 





5. M. and I. C. Sherman, "Sensory-Motor Responses in Infants," Journal of Comparative Psychology, V (1925), pp. 53-68. 





6. E. C. Kelly, Education for What is Real (New York: Harper and Brothers, 1947), p. 34. 





7. Ibid., p. 34. 





8. D. Snygg and A. W. Combs, Individual Behavior (New York: Harper and Brothers, 1949), pp. 113- 
us 


9. Lecky, op. cit., p. 150. 
10. Prescott, op. cit., p. 89. 







ture, it can be said that they are capable of ac- 
cepting into their organization any and all aspects 
of reality. If too many elements in the phenom- 
enal field are unacceptable to the self, they will 
be rejected. If test results are consistent with 
the individual’s already differentiated concept of 
himself, there is likely to be little difficulty of 
acceptance. When they deviate, there is a prob- 
lem. Levine and Murphy14 found that pro-Com- 
munist sympathizers were not only able to mem- 
orize pro-Communist materials more readily 
than anti-Communist literature, but their recall 
was better. The opposite was true of the anti- 
Communist group. 


[Diagram: three nested regions labeled, from the outside in, phenomenal environment, phenomenal self, and self-concept] 


Snygg and Combs have illustrated in the above 
diagram how the environment and the self-concept 
are related to each other. ‘‘The closer to the 
center of this figure an enhancing or threatening 
differentiation occurs, the more vividly it will be 
experienced.’’15 ‘‘The closer a deviant percep- 
tion lies to that portion of the phenomenal self 
which we have called the self-concept, the more 
difficult change is likely to be.’’16 

The organization of the self-concept and its 
functional aspects can be understood through a 
study of attitudes. Attitudes have been shown to 
play an important role. They partially define the 
areas of emotionality, or those areas which the 
individual finds difficult to incorporate into his 
phenomenal self, and which hence cause frustra- 
tion and aggression. 

Attitudes are formed through this constant in- 
teracting process. Those that are acceptable, 
or have value for the individual, are retained and 







become habits; those that have negative value are 
rejected. Allport lists four ways that attitudes 
are formed:17 (1) through the accretion of ex- 
perience, or the integration of numerous specif- 
ic responses of a similar type; (2) by individua- 
tion or differentiation; (3) through dramatic ex- 
perience or trauma; and (4) through the imitation 
of parents, teachers, or playmates they are some- 
times adopted readymade. 

Allport further defines attitudes18 as well- 
defined objects of reference, (1) either material 
or conceptual, (2) either specific or general, (3) 
signifying an acceptance or rejection of the ob- 
ject or concept of value to which they are related. 
They lead one to approach or withdraw, to affirm 
or to negate. To those who believe in the unitary 
approach, there is but one basic attitude, pur- 
pose, or motive, and that is a constant striving 
for unity. The emotional states resulting from 
a disruption of this unity cannot be treated inde- 
pendently.19 Love is the emotion subjectively 
experienced in reference to a person or object 
already assimilated. Grief is experienced when 
the personality must be reorganized due to the 
loss of one of its supports. Hatred is an impulse 
of rejection felt towards unassimilable objects. 
Experiences which increase the sense of psycho- 
logical unity, or well-being, give rise to the emo- 
tion of joy. Prescott has expressed the same 
thought20 that attitudes determine the meanings 
of situations. Conditions menacing our immed- 
iate safety arouse fear, so happenings which 
jeopardize the attainment of security in the fu- 
ture give rise to anxiety. 

The extreme opposition to this unitary 
approach has been recently stated by Thorndike.21 
He would postulate a hierarchical organization of 
selves. Traits, such as honesty, are not unitary 
to Thorndike, but rather are collections of inde- 
pendent features. He feels that even the factor 
studies of Cattell and Guilford are unproductive. 
Emotional states are due to the proclivities of 
gene determiners, and personality can be ex- 
plained or modified through the action and modi- 
fication of the stock through eugenics. He warns 
of too strict an interpretation of behavior by the 





11. Snygg and Combs, op. cit., p. 58. 


12. V. C. Raimy, "Self Reference in Counseling Interviews," Journal of Consulting Psychology, 
XII (1948), p. 153. 


13. Prescott, op. cit., p. 89. 





14. J. M. Levine and G. Murphy, "The Learning and Forgetting of Controversial Material," Journal 
of Abnormal and Social Psychology, LVIII (1945), pp. 507-517. 





15. Snygg and Combs, op. cit., p. 129. 


16. Snygg and Combs, op. cit., p. 157. 


17. G. W. Allport, "Attitudes," Handbook of Social Psychology, Carl Murchison, Editor (Worcester, 
Mass.: Clark University Press, 1935), pp. D10-Oll. 







holists, connectionists, or purposivists. 

Regardless of the manner by which one at- 
tempts to explain behavior, there are certain 
points of agreement. All would (1) agree that 
behavior is causal, (2) postulate some concept of 
optimum adjustment, (3) say that there are some 
situations which have a positive effect upon the 
individual and others which are negative, and (4) 
hold that any change in the modus operandi is extreme- 
ly difficult. 

Whether personality is approached from the 
reference point of explanation or modification, 
one concept seems to emerge as all-important, 
namely, acceptance. Adjustment can be meas- 
ured on a dimension of self-approval and self- 
disapproval, acceptance or rejection. Raimy 
postulated in his study, previously referred to, 
that the approval, disapproval, or ambivalence 
one feels for the self-concept, or some of its 
sub-systems, is related to his personal adjust- 
ment. 

This conceptual framework for the regarding 
of personality has been well summarized by Carl 
R. Rogers22 following his clinical experience 
and an analysis of accumulating research evidence. 
His theory of personality and behavior has been 
stated in the form of nineteen propositions; ‘‘some 
of these propositions must be regarded as assump- 
tions, while the majority may be regarded as hy- 
potheses subject to proof or disproof.’’ 

The studies reviewed in Chapter II (not repro- 
duced in this report; see original thesis on file 
in Library, University of Southern California), 
under Section 5, ‘‘Related Studies in the Field of 
Psychotherapy,’’ concluded that increasing ac- 
ceptance of self and the assumption of responsi- 
bility for the self constituted positive changes in 
the personality organization. Snygg and Combs 
describe this sequence or development of insight 
as follows:23 (1) Individual perception of a dif- 
ference existing between the demands of the sit- 
uation and his phenomenal self. (2) Acceptance, 










or the inclusion of a new concept into the phen- 
omenal self by means of a new differentiation of 
self. They point out that changes may occur grad- 
ually, traumatically, or in sheltered groups. 

It is proposed in this dissertation to submit 
this hypothesis of acceptance as a measure of 
adjustment and effectiveness to experimental 
proof. If the well-adjusted person is an accept- 
ing person, it should follow that an effective tea- 
cher is an accepting teacher. The teacher who 
is unthreatened should be accepting and accept- 
able. There should be a minimum of defensive 
behavior on the part of the accepting teacher. It 
is assumed that, if the teacher is accepting and 
rejecting certain elements in his phenomenolog- 
ical field, the student is doing likewise. If there 
is a conflict between the needs of the two, there 
will be problems. Changes must take place in 
the behavior of one or the other, or both, if har- 
monious adjustment is to be effected. 

It is further assumed that, if a conflict exists, 
one must resolve the problem before the other. 
That one should be the teacher. It is not impos- 
sible for the student to accomplish insight and 
corrective action, but if our concept of the tea- 
cher’s role is valid, it is the teacher’s respon- 
sibility to provide the atmosphere in which the 
student may adequately ‘‘evolve’’ and grow. 

The unthreatened or accepting teacher is one 
who can best accomplish the general function of 
education as stated by Dewey. ‘‘Of these three 
words, direction, control, and guidance, the 
last best conveys the idea of assisting through 
cooperation the natural tendencies of the individ- 
uals guided.’’24 Kirkpatrick25 expressed the 
same thought in his challenge to teachers to al- 
low the students to learn to think for themselves. 
Democracy is lived, not learned. 

Prescott, after several years of conducting 
teacher study groups on understanding children, 
concluded: ‘‘Whatever may be the root from which 
develops an emotional acceptance of all young- 





18. G. W. Allport, Personality, A Psychological Interpretation (New York: Henry Holt and Co., 
1937), pp. 294, 295. 
19. P. Lecky, op. cit., p. 152. 


20. Prescott, op. cit., p. 190. 


21. E. L. Thorndike, "The Organization of a Person," Journal of Abnormal and Social Psychology, 
XLV (1950), pp. 137-145. 





22. C. R. Rogers, Client-Centered Therapy (Boston: Houghton Mifflin Co., 1951), Ch. 11. 





23. Snygg and Combs, op. cit., p. 9. 





24. John Dewey, Democracy and Education (New York: Macmillan Co., 1916), p. 28. 


25. We 7 Kirkpatrick, "Democracy and Respect for Personality," Progressive Education, XVI (1939), 
pp. 83-90. 










sters, we have found that this attitude character- 
izes the teachers who are most effective in their 
work.’’26 

What has been found to be effective in therapy, 
a learning process, may also be effective in the 
learning process called education, according to 
Rogers.27 And Snygg and Combs28 felt that the 
student with a tremendous drive toward growth 
and self-enhancement required only practicable 
and socially acceptable opportunities for growth 
and development. 

The effect on the child of the interacting be- 
havior between the teacher and the child is ob- 
vious, according to Wickman,29 ‘‘By counter- 
attacking the attacking types of problems and by 
indulging the withdrawing types, the underlying 
difficulties of adjustment in each case are in- 
creased and the undesirable expressions of so- 
cial behavior are further entrenched.’’ It would 
therefore seem necessary, if teachers are to 
provide a good learning situation, that they un- 
derstand the student’s behavior and be able to 
accept it. They must appreciate that the student 
lives in a different perceptual world and that that 
world is amenable to change. Good behavior 
and good grades may be the goal of the elemen- 
tary child, but they may be a disgrace to the high 
school student. The teacher who can understand 
and accept the student’s concept of himself has 
already contributed much to the learner’s learn- 
ing by providing an accepting environment in 
which the learner feels worthwhile and in which 
he will therefore be more eager to assume the 
responsibility for his own learning. 


SECTION III
SOURCE OF DATA AND METHOD 


THE PURPOSE and need for this study 
as contained in Section I of this report, and Chap- 
ter II (not included in this report; see original 
thesis on file in Library, University of Southern 
California), indicated that some variation in de- 
sign might possibly produce some improvement 
in the understanding of the effective teacher. 
The literature and the experience of the investi- 
gator and colleagues have offered a possible di- 
mension of the personality organization which, 




if present, conceivably could be contributing to
the over-all effectiveness of school teachers. Section II
in this report contained a description of an attitude
of acceptance as this dimension of personality
which this investigator has sought to examine
for its relationship to teacher effectiveness.

The data for this study were collected from 
the following sources: 

1. The criterion measures consisted of eval- 
uations of the teachers’ effectiveness and relat- 
ed aspects of teacher effectiveness on three 
scales, Scales A, B, and C. Evaluations of 160 
teachers were made by the students and admin- 
istrators at three secondary schools. Of the 
160 evaluated teachers, 104 volunteered to par- 
ticipate in the predictive phase of this study. 
These teachers were asked to evaluate themselves
on the same three scales used by the administrators
and students. It was therefore possible
to compare the difference between student
and administrator ratings of those teachers who 
volunteered to participate in this study and those 
who did not. The 104 teachers who did volunteer 
will hereafter be called participating teachers 
and the 56 who did not will be referred to as non- 
participating teachers. 

2. The predictive measure of teacher effec- 
tiveness consisted of a sentence completion test 
from which a quantitative measure was obtained 
of the teachers’ attitude of acceptance. One hun- 
dred and four teachers who participated in the 
standardization and validation program complet- 
ed the sentence completion test divided into two 
parts on the basis of scoring principles. 

This Section will present a summary of the
source of data used in this investigation. Section
IV will present a detailed description of and justification
for the criterion measures used. Section
V will offer a rationale and use for the sentence
completion technique. Section VI will present
the scoring principles used in this investigation
and the method of establishing reliability
for the test. Section VII will summarize the
findings from the criterion measures, and Section
VIII will present the correlations between
the criterion measures and the predictive measures.


I. Schools Participating in the Study 





The secondary schools selected for this study 





26. Division on Child Development and Teacher Personnel, D. H. Prescott, Helping Teachers Understand Children (Washington, D.C.: American Council on Education, 1945), p. 10.


27. Rogers, op. cit., p. 384.
28. Snygg and Combs, op. cit., p. 238. 


29. E. K. Wickman, Children's Behavior and Teachers' Attitudes (New York: The Commonwealth Fund, 1928), p. 171.







were chosen primarily on the basis of availabil- 
ity to the investigator. They represent, however, 
a fair cross section of the population and socio- 
economic status of a large metropolitan area in- 
cluding both urban and rural communities. 

Two of the three schools are in city school districts
and the third in a union district. Two of them
are primarily agricultural communities including
some industry, and the third is an urban residential
area contiguous to the city of Los Angeles.


All three are large departmentalized schools of 
over one thousand students with 46, 83, and 103 
classroom teachers. 


II. Criterion Data





Criterion measures for this study consisted 
of student, administrator, and self evaluations 
of classroom teachers on three scales. Table 
I indicates that 160 teachers were evaluated, in- 
cluding 93 male teachers and 67 female teachers. 
Of the 160 evaluated teachers, 104 participated 
in the standardization and validation of the pre- 
dictive measure, 

In Schools I and II, all of the teachers were
evaluated, while in School III only the participating
teachers were rated. A total of 10,115
student evaluations were obtained on the 160 teachers,
with a mean number of 79.8 for School I,
60.16 for School II, and 47.2 for School III.
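
The means quoted above can be checked against the totals that remain legible in Table I. The following is a minimal sketch in Python and is not part of the original study. The teacher count for School I is an assumption inferred from that school's reported mean, and the School III figures are obtained by subtraction from the stated totals.

```python
# Figures quoted in the text and in the legible cells of Table I.
total_evaluations = 10_115
teachers_evaluated_total = 160
evaluations = {"I": 3671, "II": 4933}
teachers_evaluated = {"II": 82}
reported_means = {"I": 79.8, "II": 60.16, "III": 47.2}

# Assumption: School I's teacher count is recovered from its reported mean.
teachers_evaluated["I"] = round(evaluations["I"] / reported_means["I"])

# School III values follow by subtraction from the stated totals.
teachers_evaluated["III"] = teachers_evaluated_total - sum(teachers_evaluated.values())
evaluations["III"] = total_evaluations - sum(evaluations.values())

# Recompute the mean number of student evaluations per teacher.
means = {
    school: round(evaluations[school] / teachers_evaluated[school], 2)
    for school in ("I", "II", "III")
}

for school in ("I", "II", "III"):
    print(school, teachers_evaluated[school], evaluations[school], means[school])
```

Under these assumptions the recomputed means agree with the reported values to the precision given in the text.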

It will be noted from the table that the number
of 10th grade students of School III is out of proportion
both to the number of 11th and 12th grade
students and as compared with Schools I and II.
This can be accounted for by the fact that the volunteer
participating teachers in that school had
more 10th grade than 11th and 12th grade classes.
At Schools I and II, the classes visited
were equally distributed among the three grades.

The investigator and two assistants selected 
a sufficient number of classes of required 
courses in Schools I and II to obtain a sampling 
of all teachers. In order to disrupt the classes 
as little as possible, the students in each class 
evaluated each of their several teachers at one 
time. All evaluations for one school were ob- 
tained in one day. The average length of time 
for administration was fifteen minutes for all 
three scales. 


III. Evaluation Scales





Several problems presented themselves in 
gathering the criterion data. (1) Some evaluation 







of the teacher’s efficiency was desirable, as well 
as an indication of the students’ attitudes and 
feelings toward their teachers’ behavior. There- 
fore, at least two scales were essential, one for 
effectiveness and one or more to register the 
students’ feelings on other factors. 

It was felt that current rating scales were too 
structured for purposes of this study. An un- 
structured, or non-itemized, scale was prepared 
for the effectiveness evaluation. (See Appendix 
A of original thesis on file in Library, University 
of Southern California.) For Scales B and C, the 
situation was structured or itemized to the degree 
that the two extremes were defined for the evalu- 
ator. 

(2) The reliability of rating scales is dependent
upon the clarity and consistency of the instructions.
Only the investigator and his two assistants
administered the evaluations in 144 classrooms.
The instructions to the students were
standardized and presented uniformly. (See Appendix
B of original thesis on file in Library of
the University of Southern California.) 1 The students
had no questions on Scale A. There were
very few on Scale B. The students seemed to 
understand the difference between hypothetical 
Teachers X and Y. However, there was some 
question regarding the extent to which a certain 
teacher had the characteristics of X or Y. Some 
students felt that a certain teacher had some of 
each. In that case, the examiner had to repeat 
the instructions that the check mark was to be 
placed nearer X than Y if the student felt that the 
teacher had more of the X traits than Y traits, 
or if the teacher had all of the X traits in excess 
of the Y, but he had them to a lesser degree than 
the ideal established in the definition. After com- 
pleting the B Scale, the students had no questions 
on the C Scale, due to its similarity to the B 
Scale. 

(3) A third problem was the order of presen- 
tation of the three scales. It was desired to re- 
duce as much as possible the influence of a re- 
sponse set, which is normally present whenever 
one evaluates another person on more than one 
item or scale.2 It was decided that the general 
teaching effectiveness scale, Scale A, if presented
first, would reduce the influence of a response
set to a minimum. In the first place, Scale A
was unstructured or non-itemized. The evaluator
could set his own limits and use his own
standards. If Scales B and C were completed
first, some carry-over from those structured 
scales would occur. In the second place, it was 





Section III 


1. The only variation in the instructions was at School III, where only the participating teachers were evaluated.


2. L. J. Cronbach, "Response Sets and Test Validity," Educational and Psychological Measurement, VI (1946), pp. 475-494.










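TABLE I

SOURCE OF CRITERION DATA

                                              Schools
Evaluations                          I         II        III       Total

Teachers Evaluated               [illeg.]      82     [illeg.]      160
  Male                           [illeg.]      51     [illeg.]       93
  Female                         [illeg.]      31     [illeg.]       67

Teachers Participating           [illeg.]      51     [illeg.]      104
  Male                           [illeg.]      31     [illeg.]       60
  Female                         [illeg.]      20     [illeg.]       44

Percent of Evaluated Teachers
Participating                    [illeg.]      62.    [illeg.]    [illeg.]
  Male                           [illeg.]      61.    [illeg.]    [illeg.]
  Female                         [illeg.]      64.    [illeg.]    [illeg.]

Administrator Evaluations        2 (column illegible)
Self Evaluations                 30 (column illegible)

Student Evaluations
  10th Grade Male                [illeg.]     132     [illeg.]    [illeg.]
  10th Grade Female              [illeg.]     172     [illeg.]    [illeg.]
  11th Grade Male                [illeg.]     149     [illeg.]    [illeg.]
  11th Grade Female              [illeg.]     182     [illeg.]    [illeg.]
  12th Grade Male                [illeg.]     116     [illeg.]    [illeg.]
  12th Grade Female                 93        123     [illeg.]    [illeg.]
  Total                            704        874     [illeg.]    [illeg.]

Total Teacher Evaluations         3671       4933     [illeg.]    [illeg.]

Mean Evaluations per Teacher      79.8       60.16    [illeg.]    [illeg.]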







felt that all evaluators would be more familiar
with the nature of Scale A than of Scales B and C,
and they would therefore be able to complete it
more easily.

(4) A fourth problem involved the tendency or 
set to evaluate each teacher as above average. 
It was thought that, if each evaluator first ranked
his several teachers, the set to rate each teacher
about the same could be broken. Such a procedure
would force greater variation and more
discrimination. However, an equally difficult
problem would be likely to appear, namely, the 
lack of freedom to place the teacher where the 
evaluator wished. Forcing the student to rank 
his teachers might interfere with giving him the 
opportunity to express his unstructured feelings; 
he would be forced into a position of deciding 
which of two or more teachers was the more ef- 
fective, and the resulting frustration could cause 
him to record his evaluation in a position other 
than that which he desired. 

An attempt was made to break any set toward 
ranking all teachers above average by reversing 
the position of the optimum. Theoretically, 
‘‘Teacher X’’ on Scales B and C represented the 
same level as ‘‘superior’’ on Scale A. At the 
same time, it was not felt that this would inter- 
fere with the evaluator’s judgment of the teacher. 

(5) A fifth problem was closely related to the 
fourth. A decision was made to construct the 
scales with no intervals. Again it was thought 
desirable to allow the evaluator complete free- 
dom in his recording and therefore introduce as 
few distractions as possible from the primary 
function, namely, the effectiveness of the teach- 
er as felt by the student or administrator. The 
scoring of the evaluations by the investigator was 
accomplished by a nine-point scale placed on the 
line, and the interval value recorded. 
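
As an illustration of the scoring step just described, the sketch below converts the position of a check mark on an unmarked line into a nine-point value. It is a hypothetical reconstruction, not the investigator's procedure: the function name `score_mark`, the use of equal-width intervals, the treatment of marks falling on a boundary, and the `optimum_at_left` switch (standing in for the reversed position of the optimum on Scales B and C) are all assumptions, since the report does not describe the scoring template in detail.

```python
def score_mark(position: float, line_length: float,
               intervals: int = 9, optimum_at_left: bool = False) -> int:
    """Convert a check-mark position on an unmarked line to an interval score.

    Assumptions (not stated in the report): intervals are of equal width,
    a mark on a boundary is assigned to the interval on its right, and the
    highest score always denotes the optimum end of the scale.
    """
    if line_length <= 0:
        raise ValueError("line_length must be positive")
    if not 0 <= position <= line_length:
        raise ValueError("position must lie on the line")

    width = line_length / intervals
    # Zero-based interval index; a mark at the far right end stays in the last interval.
    index = min(int(position / width), intervals - 1)
    score = index + 1

    # When the optimum is printed at the left end, reverse the numbering.
    if optimum_at_left:
        score = intervals + 1 - score
    return score


# Hypothetical example: a 90 mm line with marks measured in millimetres from the left end.
print(score_mark(45, 90))                        # middle of the line
print(score_mark(5, 90, optimum_at_left=True))   # near the left-hand optimum
```

Placing the template over the line after the rating is made leaves the evaluator free of any printed divisions while still yielding a numerical value for analysis.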

The Scales used by the administrators were 
identical to those used by the students. The in- 
structions for Scale A were also identical. How- 
ever, for Scales B and C the administrator was 
asked to evaluate the teachers on the basis of
what he felt would be the students’ feelings about
the teachers’ attitudes and methods of going
about their teaching.


IV. Data on Predictive Measures 





The data on the predictive measures were ob- 
tained from two sources: (1) scores on a sentence 
completion test, and (2) self-evaluations by the 
participating teachers on the three scales used 
by the students and administrators. A total of 
104 teachers participated in the standardization 
and validation of the sentence completion test, 
the primary source of data. 

Of the 104 participating teachers, 60 or 58 
percent were males and 44 were females. Sev- 
enty-two or 69 percent were married and 31 un- 







married. (One declined to state.) The average 
age of the group was 38.5 and the average number
of years of teaching experience was 13.6.
Sixty, or 58 percent, of the teachers were aca- 
demic instructors. 

An original test was compiled consisting of 91 
items or stimulus phrases. A majority of the 
items were taken from other published tests; 
the rest consisted of several items constructed
by the investigator to sample attitudes toward
structured school situations.

It was proposed to score the item responses 
on a uni-dimensional value of acceptance. Stand- 
ardization of the scoring system was accomplish- 
ed by correlating the test responses of the teach- 
ers at School I against the evaluations of the stud- 
ents on Scale A, teacher effectiveness. 

After 64 items were eliminated by observation
and difficulty of scoring, the tests of the 83
teachers in Schools II and III were scored ‘‘blind’’
on the 27 items remaining. These total scores
were validated against the student and administrator
evaluations at Schools II and III.
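
The validation step can be pictured as a correlation between each teacher's total test score and the mean rating given by his students. The report does not name the coefficient at this point; the sketch below assumes the product-moment (Pearson) coefficient, and the scores and ratings shown are invented for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: total acceptance scores on the retained items and
# mean student ratings on Scale A for five teachers.
test_scores = [12, 15, 9, 20, 17]
mean_scale_a = [5.1, 6.0, 4.2, 7.4, 6.3]
print(round(pearson_r(test_scores, mean_scale_a), 3))
```

Because the tests were scored without knowledge of the ratings, a coefficient obtained in this way is not inflated by the scorer's acquaintance with the criterion.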

Of the 64 items rejected, it was found that 13
of them, structured to elicit responses to various
drives, could be scored on another dimension.
The investigator then standardized a scoring system
for the 13 items. Again, the validation of
these items was accomplished by correlating the
scores against the student and administrator evaluations
of the participating teachers in Schools II
and III.

The chi-square technique was applied to the
20 teachers rated highest by the students on Scale
A, and to the bottom 20. Fourteen of the 27 items
in group 1, and 12 of the 13 in group 2, met a
significant or near significant level. These 26
items were retained and validated against the
student and administrator evaluations of Schools
II and III.
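
The item analysis described above may be illustrated by a chi-square test on a fourfold table for a single item. This is a sketch under stated assumptions rather than the investigator's computation: it assumes each response is classified as either accepting or not accepting, it applies no continuity correction, and the counts in the example are hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square (1 degree of freedom, uncorrected) for a fourfold table.

    a, b: accepting / not accepting responses in the high-rated group
    c, d: accepting / not accepting responses in the low-rated group
    """
    n = a + b + c + d
    row_high, row_low = a + b, c + d
    col_accept, col_other = a + c, b + d
    if 0 in (row_high, row_low, col_accept, col_other):
        raise ValueError("chi-square is undefined when a marginal total is zero")
    return n * (a * d - b * c) ** 2 / (row_high * row_low * col_accept * col_other)


# Critical value of chi-square at the .05 level with 1 degree of freedom.
CRITICAL_05 = 3.841

# Hypothetical item: 15 of the 20 highest-rated teachers and 6 of the
# 20 lowest-rated teachers give an accepting response.
chi2 = chi_square_2x2(15, 5, 6, 14)
print(round(chi2, 2), chi2 > CRITICAL_05)
```

An item whose value exceeds the critical level discriminates between the two extreme groups and would be a candidate for retention; what counted as a "near significant" level is not specified in the report.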

A secondary source of predictive data con- 
sisted of the teachers’ self-evaluations and cer- 
tain biographical data such as sex, marital sta- 
tus, number of dependents, subjects taught, years 
of age, and years of teaching experience. 


V. Sequence of Activities 





1. Investigator constructed criterion meas- 
ures: Evaluation Scales A, B, and C. 

2. The 91 items were selected for the predic- 
tive measure, the Sentence Completion Test. 

3. The investigator solicited the cooperation 
of secondary schools in this study. 

4. The student and administrator evaluations 
of all classroom teachers were obtained at School 
I during one day. The investigator administered 
the predictive measure to those teachers who vol- 
unteered to participate in the standardization and 
validation of the predictive measure. These tea- 
chers also filled out the evaluation scales on them- 







selves and completed a personal history blank. 

5. The same procedures as in Step 4 were fol- 
lowed at School II. 

6. At School III, the procedure was altered. 
The investigator appeared before the faculty of 
the school and explained the project and asked for 
volunteers. Those teachers who volunteered con- 
stituted the participating teachers, and they were 
the only ones evaluated by the students. It was 
hoped that this modification at School III might
add some significant experience to the problem 
of obtaining adequate criterion measures. The 
predictive measures were obtained in the same 
manner as those in Schools I and II. 

7. While the data were being collected at 
Schools II and III, the investigator proceeded with 
the standardization of Parts I and II of the sen- 
tence completion test. 

8. Chi-square technique was used to refine
Parts I and II of the sentence completion test.

9. The sentence completion test scores for 
the teachers at Schools II and III were scored 
‘‘blind’’ and were validated against the student,
administrator, and self-evaluations. 


SECTION IV 


THE CRITERION MEASURES 





THIS SECTION will present the purpose of 
the criterion measures used in this investigation 
and the justification for the use of the raters used 
to evaluate the subjects on the scales used as cri- 
terion measures. 

The following three scales of teacher effective- 
ness as used in this study will be described 
briefly: 


1. Scale A. A non-itemized and unstructured 
scale of teacher effectiveness. 

2. Scale B. A structured scale to elicit the 
evaluators’ judgment of the teachers’ atti- 
tude toward the students. 

3. Scale C. A structured scale to measure
the rater’s judgment of the ease with which 
the teacher goes about his teaching. 


This section will also attempt to justify the
use of the following three classes of raters on the
three scales described above:


1. A representative sample of student judg- 
ment in each of three secondary schools. 

2. Administrator judgment. 

3. The judgment of the participating teachers 
themselves at the three schools. 


This investigator is assuming that the person 
best qualified to evaluate the teachers’ effective- 
ness is the one nearest the teachers, namely, 
the student. The other person most concerned 
with the teacher is the administrator. A third
person who in one way is most concerned with 
the teacher is the teacher himself. It was pro- 
posed, therefore, to secure evaluations from all 
three sources. 


Many investigators have criticized the prac- 
tice of using student judgment as a criterion 
measure because students are presumed to be 
immature and irrational. One instructor 1 asked 
his ninth-grade students to evaluate him two sem- 
esters later on what they liked and disliked about 
him. He was unaware of the first two traits they
disliked about him, namely, domineering and no
sense of humor. He indicated that this was a
‘‘blow.’’ However, it was a ‘‘great satisfaction
to know they had really learned something,’’ the
factor they liked most about him. The implication
would seem to be that the teacher could not
accept those traits the students disliked about
him. If they were true, the fact that the students
really learned served as the ‘‘great reward which
lends justification to his existence as a teacher.’’
This article also introduced some information on 
our second question concerning the basis for judg- 
ment. Is one to accept the teacher's ability to 
‘‘teach us something’’ or should one consider the 
other traits of domineering, and no sense of hum- 
or, as the significant factors? 

Hart 2 concluded from his survey of 10,000
high school students that students were mature 
enough to think straight on the question of teach- 
ers and teaching, and that they could weigh values 
and arrive at reliable and significant conclusions. 
Cook and Leeds 3 used fourth to sixth grade pupil 
ratings under the assumption that ‘‘they were 
smart enough to evaluate and not as sophisticated





1. S. Callahan, "Is Teacher Rating by Students a Sound Practice?" School and Society, LXIX (1949), p. 98.





2. F. W. Hart, Teachers and Teaching (New York: The Macmillan Co., 1934), p. 283. 





3. W. W. Cook and C. H. Leeds, "Measuring the Teaching Personality," Educational and Psychological
Measurement, VII (Autumn, 1947), pp. 399-410.







as high school students.’’ In their validation study
they found that the pupil ratings correlated higher 
with inventory scores than the administrators’, 
but not as high as the experts’ ratings. 

While the educators’ philosophy and attitudes 
toward students relative to their right to make 
their own decisions have often been divided, there 
has been comparative agreement among psychol- 
ogists, curriculum makers, and methods workers 
that the students’ interests should be considered 
first, and that teaching should be at the students’ 
level. However, the privilege of evaluating the 
teachers’ work has been largely reserved by the 
adult. Only recently has student judgment been 
considered a reliable criterion measure. There
is often an element of threat to the authority fig- 
ure in the matter of student evaluation which is 
seldom explored. A consideration of this prob- 
lem usually revolves instead around the maturity 
of the student and whether his judgment can be 
trusted. It would seem to be consistent with our 
growing knowledge and appreciation of the indi- 
vidual’s ability at all ages to evaluate his own 
environment in terms of what stimulates him 
positively and negatively that the individual should 
also share in the planning and execution of the ed- 
ucational process. The executive authority in 
any social institution has been delegated to a few
for efficient operation, but in research, at least,
consideration might well be given to the student’s
values.


III. The Use of Administrator Evaluations





The administrator has long been the one to 
evaluate the work of his subordinates. It is not 
proposed in this study to eliminate his function. 
He is responsible for the operation of the school. 
One of his duties is the selection of teachers. If 
the students are not being taught, the teacher is 
not teaching, and the administrator is therefore 
indirectly concerned and directly responsible. 

There seems to be no disagreement concern- 
ing the administrator’s function in the evaluation 
process. There is some disagreement regarding 
the reliability of his judgment. If the investiga- 
tor is to use evaluative judgment as a criterion 
measure, he must consider the administrator. 
Some investigators have sought to use pupil growth 
as measured by objective tests and other data for 
their criterion measures, but all they have done 
is to remove direct subjective evaluation. Indirect- 
ly, the administrator’s judgment has already been 
considered in the standardization of the objective 
data. 







IV. The Use of Self Evaluations 





There is a growing body of knowledge and ex- 
perience which would indicate that the individual 
himself is capable of making evaluative judgments 
of himself. It has long been customary to re- 
view the ratings with the one rated. The purpose 
has been to improve the individual’s performance 
and to let him know how he is doing in relation to 
his colleagues and peers. This has been a frustrating
experience for both the rater and the one
rated for two reasons. In the first place, the two
seldom see ‘‘eye to eye’’ for objective reasons.
There is the implication that one of the two is less
right than the other. In the second place, the sub- 
jective defenses of both parties must be consid- 
ered. 

Few would deny that much can be gained by 
soliciting the assistance of the subject in any ac- 
tivity. This factor has been overlooked in most 
evaluation projects. On the other hand, it is well
established clinical practice to use the subject’s
evaluation of himself. More use of self evaluations
is necessary to establish the reliability
of self evaluations for criterion measures.


V. The Use of Itemized and Non-Itemized Scales 





Predictive measures must be standardized and
validated against active teachers. With these data
of discriminating factors, selection can be performed
more efficiently. In order to standardize
the predictive measures, the criterion measures
must be carefully defined or one must accept
the evaluator’s judgment for whatever reason or
reasons are meaningful to him.

Suchman 4 has pointed out the fact that both 
itemized, or defined, and non-itemized scales 
have been used in social science research. In 
the non-itemized approach, no attempt is made 
to produce a definition of the variable. 

In the selection and definition of itemized aggregates
of attributes, the number of characterizing
items that exist for any single variable is
unlimited. And, as Suchman points out, there is
little inherent reason why any one item is better
than another. The final decision of whether an
item characterizes a universe must be a subjective
one. Suchman 5 challenges the research
worker to be scientific by translating loose descriptive
terminology into more precise classificatory
systems. Most research efforts in the
area of teaching effectiveness have attempted to
define the meaning of some attribute or variable
in such a way as to permit the classification of





4. E. A. Suchman, "The Logic of Scale Construction," Educational and Psychological Measurement, X
(1950), p. 82.


5. Ibid., p. 79.










persons according to the degree to which that at- 
tribute is absent or present. This has been the 
problem of scale construction. The selection 
and definition of these ‘‘meaningful variables’’
has been the problem which has contributed to 
much of the confusion. The result has been con- 
clusions such as Baxter’s 6 that there is no one 
pattern-personality of exact or particularized 
characteristics or any single configuration of per- 
sonal attributes which characterizes all effective 
teachers. 7 

Even though Suchman rules out the procedure 
of selecting items on the basis of some correla- 
tional test, 8 he states that there must be an ad- 
equate content interpretation for both acceptance 
and rejection of an item. This would seem to in- 
dicate that any hypothesis for the selection of an 
item is acceptable providing it is submitted to ex- 
perimental evidence. 


VI. The Use of Scale A 





Scale A as a criterion measure of teacher ef- 
fectiveness is considered to be non-itemized in 
that no traits or attributes are suggested to the 
rater. The investigator has deliberately withheld
any suggestion of structure. It may be considered
a global approach rather than an atomistic
one in that it purports to establish effectiveness
by sampling a universe of unstructured subjective
opinion and feelings. The predictive measure
may be carefully defined; and, if the predictive
measure used does correlate with the global
criterion, it may be concluded that some relation- 
ship exists without saying that it is causative. 
Some knowledge has therefore been gained. Con- 
tinued refinement of the definition and repeated 
sampling of the population may further add to our 
understanding of the effective teacher. 


VII. The Use of Scales B and C 





Additional evidence of the effectiveness of non- 
itemized versus itemized scales may be gained 
from a study of the relationship between evalua- 
tions registered on both types of scales by the 
same population. It was partly for this reason 
that Scales B and C were used in this study. 













It is hoped that Scales B and C will also add 
something to our definitions of effective teachers 
as proposed by Suchman. Baxter 9 pointed out 
as a result of her study that one cannot separate 
the teacher’s personality from his skill as an in- 
structor, as is suggested by many rating scales. 
She also observed that the effective teacher could 
identify himself with the learner because he was 
ready and willing to forget self and to rejoice 
with the learner in his satisfaction at discovering 
for himself. 

Hart 10 found that Teacher A, the most liked 
one, was human, friendly, companionable and 
‘‘one of us.’’ Teacher A was also interested in 
the pupils, and understanding. Those were the 
third and fourth reasons, respectively, given by 
the students. 

Rogers’ 18th proposition 11 in his theory of
personality and behavior reflects this concept: 


When the individual perceives and ac- 
cepts into one consistent and integrat- 
ed system all his sensory and visceral 
experiences, then he is necessarily 
more understanding of others and is 
more accepting of others as separate 
individuals. 


These observations and the investigator’s ex- 
perience contributed the framework for Scale B. 
It was desired on this variable to have the eval- 
uator’s expression of what he felt was the atti- 
tude of the teacher toward him, in the case of the 
student, and what the administrator felt was the 
attitude of the teacher toward his students. Tea- 
cher X on Scale B theoretically represented the 
accepting, understanding, and companionable 
teacher. 

Scale C attempted to incorporate the findings 
of Baxter 12 that the good teacher was poised and 
able to face conflicting demands without becoming 
hurried or petulant. The good teacher did not 
seem to be actuated by the necessity of having 
his pupils accomplish a given amount of work 
within the shortest time, but was leisurely and 
relaxed in his guidance. 

Hart 13 found that the second reason given by 
high school students for liking Teacher A was 





6. B. Baxter, Teacher-Pupil Relationships (New York: Macmillan Co., 1942), p. 10.

7. Ibid., p. 9.
8. Suchman, op. cit., p. 84.
9. Baxter, op. cit., p. 36.

10. Hart, op. cit., p. 136.

11. C. R. Rogers, Client-Centered Therapy (Boston: Houghton Mifflin Co., 1951), p. 520.

12. Baxter, op. cit., pp. 73, 74.

13. Hart, op. cit., p. 134.







cheerfulness, happiness, and a good-natured dis- 
position with a sense of humor. 

The investigator’s experience in his observa- 
tion of teachers and counseling with teachers has 
led him to feel that the poor teacher takes his 
work too seriously and seems to be trying too 
hard. It would seem that some degree of these 
traits is necessary for effective teaching, but 
that there is a point of diminishing returns be- 
yond which they are detrimental. This type of 
teacher also seems to take too much responsi- 
bility for the guidance of his students. It seems 
to be very difficult for him to allow the student 
the privilege of learning for the sake of learning. 
It is also difficult for the poor teacher, or at least
the teacher who is having trouble with his teach- 
ing, to realize that growth is a functional, active 
process rather than a passive one of being told 
and shown. 


SECTION V 
THE PREDICTIVE MEASURE 


I. Summary of Rationale 





IT WAS indicated in the preceding section 
that the well-adjusted and effective individual is 
assumed to be an accepting individual. The ac- 
cepting individual has been shown to be one who 
perceives his environment to be unthreatening to 
his concept of himself. The individual whose be- 
havior indicates that he is well adjusted and in- 
tegrated seems to be the one who perceives the 
least number of inconsistent or inharmonious el- 
ements in his ‘‘phenomenal field.’’ Those ele- 
ments which are perceived to be unacceptable to 
the self will be rejected; those elements which 
are consistent with or which fit into the individ- 
ual’s concept of himself will be accepted. 

It is not the purpose of this study to determine 
the teacher’s ‘‘phenomenal self’’ or his self-con- 
cept, at least not for diagnostic or therapeutic 
purposes. The investigator has accepted the di- 
mension of acceptance as a tenable hypothesis 
and he will attempt to submit it to experimental 
analysis in a specific situation, namely, teacher 
effectiveness.

It is assumed that a projective test can ade- 
quately determine the quantitative degree of this 
acceptance-rejection dimension possessed by the 
teacher. It is also assumed that if this trait can 







be projected to a measurable degree, it is detect- 
ed or ‘‘felt’’ by others. It is assumed that Scale 
A will measure the evaluator’s reaction to the 
teacher’s effectiveness. If some relationship is 
found to exist between the criterion measures and 
the projected trait on the predictive measure, it 
may be assumed that some relationship exists. 

It is further assumed that Scales B and C will 
measure some of the projected trait as detected 
or observed by the evaluator. 

The investigator has further delimited his study
by not attempting to compare the degree of accep-
tance in teachers with that in other vocational
groups. Additional studies may be conducted to
determine any significant differences with
respect to this factor.


II. Available Devices





There is a certain hierarchy of tests or de-
vices to be used for screening purposes. During
the war, attempts were made to screen those in- 
dividuals most likely to break under stress; Zubin, 
in his report on the investigation, 1 pointed out 
the difficulty of predicting stress tolerance from 
miniature experimental stress situations, and 
stated that group screening techniques have be- 
come a necessity. 

At a lower level than stress situation in war 
is the process of screening the maladjusted from 
the adjusted in a clinical situation. Beyond that 
level is the screening of job applicants. Zubin2 
indicated that critical scores or items may be sig-
nificant in one situation but not in others, and the
problem of verifiable data becomes more compli- 
cated at the lower levels of intensity. This would 
mean that discriminating differences among ‘‘nor- 
mals,’’ or applicants, would be more elusive. 
At the same time the selection process, as of 
teachers, is an important element in the efficient 
operation and administration of the educational 
program. 

The consensus of writers in the field seems 
to be that rating scales and objective paper and 
pencil personality tests are inappropriate for 
measuring the teacher personality. Some of this
inappropriateness is due to a narrow definition 
of criterion measures, according to Baxter. 3 
Structured personality tests of the usual paper 
and pencil type do not offer access to the person- 
ality make-up, or its processes, according to 
Hutt. 4 





1. J. Zubin, "Recent Advances in Screening the Emotionally Maladjusted," Journal of Clinical Psychology, XVI (1948), p. 57.
2. Zubin, op. cit., p. 59.
3. B. Baxter, Teacher-Pupil Relationships (New York: Macmillan Co., 1942), p. 153.
4. M. L. Hutt, "The Use of Projective Methods of Personality Measurement in Army Medical Installations," Journal of Clinical Psychology, I (April 1945), p. 135.










Rohde 5 pointed out that interviews, question-
naires, and inventories have certain limitations,
because of their direct questioning technique, 
which tends to make the individual self-conscious 
and defensive and usually prevents him from dis- 
closing his deeper self. 

The personality inventory, according to Zubin,6 
in its present form was devised to differentiate 
between normal and deviant groups and not for 
differentiating within the deviant group. He goes 
on to say that for this purpose a new group of 
tests is needed, perhaps of the word association 
and other projective types. It might be added 
that these same tests are also necessary for dif- 
ferentiating within the normal group, as in per- 
sonnel selection. 


III. The Use of Projective Techniques





For purposes of this study, the investigator
felt that the projective test could best measure
the personality dimension to be submitted to ex-
perimental analysis. He considered the individ-
ual’s need for a unitary, consistent, and un-
threatened personality organization to be a dyn-
amic mechanism designed to establish a state of
equilibrium. A survey of available testing tech-
niques seems to reveal that the projective meth-
od is better able than the conventional personal-
ity inventory to sample the personality organiza-
tion.


IV. Characteristics of Projective Techniques 





Freud was the first to use the term ‘‘projec- 
tion.’’ However, as currently used in projective 
tests, projection means more than a defensive 
function, according to Bell;7 it is also an expres-
sive function. Bell 8 gives the Latin derivation
as ‘‘to cast forward,’’ which is the action involved
in the technique. He points out 9 that the purpose 
of projective techniques is to gain insight into 
the individual’s behavior. This also is the pur-
pose of other personality tests; however, projec-
tive tests are global in their approach in contrast
to the atomistic approach which centers its atten- 
tion upon traits of the personality considered as 
disparate items. As Harriman states: 


The purpose of these procedures is to ob- 
tain an insight into values, wishes, re- 
pressions, emotional organization, and so 
on, which the individual might be unwilling 
or unable to supply if the direct-question 
method were used. 10 


There are three characteristics of projective 
techniques, according to Bell: 11


1. Presentation of a stimulus to the subject 
which does not make manifest, or partially makes 
manifest, the real purpose of the examiner. 

2. Sampling individual behavior in a structured 
event of sufficient brevity to be clinically practic- 
able and of sufficient stimulation to call forth a 
wide range of individual responses. 

3. Consideration of the recorded behavior, as 
well as the personality that produces it, as an
organized totality. 


The purpose of projective tests, according to 
Korner, 12 is not to predict reality behavior; tests 
merely reflect secondary configurational patterns. 
He cautions test users to realize that tests mere- 
ly record behavior, and all behavior manifesta- 
tions are expressive of an individual’s personal- 
ity. The scoring of a test is merely the examin- 
er’s ‘‘shorthand’’ used to reduce behavior to man- 
ageable proportions; and his clinical insight or 
judgment is only by inference based upon famil-
iarity with behavior dynamics. 13

Bray, 14 on the other hand, has stated that 
testers have seldom even attempted to predict 
behavior from their test results. The apparent 
conflict between Korner’s and Bray’s positions 
seems to revolve around the use of tests as pre- 
dictors of behavior or as diagnostic devices from 
which one may infer certain dynamic tendencies
through clinical judgment.

5. A. R. Rohde, "Explorations in Personality by the Sentence Completion Method," Journal of Applied Psychology, XXX (April 1946), p. 169.
6. Zubin, op. cit., p. 59.
7. J. E. Bell, Projective Techniques (New York: Longmans, Green and Co., 1948), p. 2.
8. Ibid., p. 3.
9. Ibid., p. 4.
10. P. L. Harriman, The New Dictionary of Psychology (New York: Philosophical Library, 1947), p. 270.
11. Bell, op. cit., pp. 4-6.
12. A. F. Korner, "Theoretical Considerations Concerning the Scope and Limitations of Projective Techniques," Journal of Abnormal and Social Psychology, XLV (1950), p. 623.

It would seem that different stages in the pro- 
cess of test construction are the occasion for this 
disagreement. The second step after the criter-
ion selection in the development of prediction in-
struments, according to Horst, 15 is the assemb-
ling of data on a group representative of the pop-
ulation for which predictions are to be made. The
nature of these data will be controlled by the 
tentative hypotheses held regarding the relation 
between the criterion and various items. Korner 
was apparently thinking of this stage of the devel- 
opment of predictive measurements. 

Horst then goes on to formulate the final three 
steps of combining the data to give a total predic- 
tive score, trying out the results on a check 
sample, and finally, if the results hold up, ac- 
tually using the test as a predictive instrument. 
Bray would say that the validation step, or try- 
out on a check sample, is the important function 
of a test. 
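
The sequence of steps attributed to Horst above (assemble data, combine items into a total predictive score, try the result on a check sample) can be pictured with a small sketch. The Python below is an illustration only: the selection rule (keep items whose correlation with the criterion reaches a threshold), the unit weighting, the function names, and the data are all assumptions introduced here, not Horst's procedure nor the one used in this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation; assumes both lists vary (non-zero variance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def select_items(item_scores, criterion, threshold):
    """Hypothetical selection rule: keep the index of each item whose
    correlation with the criterion in the development sample is at least
    `threshold`. Items with no variance are skipped."""
    keep = []
    for j in range(len(item_scores[0])):
        column = [row[j] for row in item_scores]
        if len(set(column)) > 1 and pearson_r(column, criterion) >= threshold:
            keep.append(j)
    return keep


def total_scores(item_scores, keep):
    """Unit-weighted total predictive score over the retained items."""
    return [sum(row[j] for j in keep) for row in item_scores]


def check_sample_validity(item_scores, criterion, keep):
    """Correlation of the total score with the criterion in a check sample
    that played no part in selecting the items."""
    return pearson_r(total_scores(item_scores, keep), criterion)


# Invented illustrative data: rows are subjects, columns are scored items.
dev_items = [[1, 0, 1], [2, 1, 0], [3, 0, 1], [4, 1, 0]]
dev_criterion = [1, 2, 3, 4]
check_items = [[2, 1, 1], [1, 0, 0], [3, 1, 0]]
check_criterion = [2, 1, 3]

kept = select_items(dev_items, dev_criterion, threshold=0.5)
validity = check_sample_validity(check_items, check_criterion, kept)
```

The point of the separate check sample is that items chosen because they correlate with the criterion in one group will tend to look better in that group than they deserve; only the second correlation gives a fair estimate of predictive value.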

Bell 16 further clarifies the functions of per- 
sonality tests, and projectives in particular, by 
stating that projective devices serve two main 
functions: (1) the offering of rapid, valid, and
reliable means by which a clinician may arrive
at a picture of the personality of a subject; and
(2) the facilitating of personality studies in psy-
chological research. It is with this latter func-
tion of projectives that this investigator is con-
cerned. He is using a projective device to meas-
ure an aspect of personality and to determine any
possible relationship that that aspect of person- 
ality may have with teacher effectiveness. 


V. Criteria for the Adequacy of a Projective 
Technique 





According to Bell, 17 any projective technique 
must meet the following criteria: 


1. The first is that the technique must stimu-
late behavior by the subject in which the differ-
ent layers of the personality may be manifested
and, as much as possible, distinguishable. 







2. ....the stimulus materials used must be 
simple and readily available. 

3. ....the method must not consume more 
time than is proportionate to the value of the in- 
formation received. ... 

4. The technique should be easy to adminis-
ter. ...

5. The method must be reliable in the sense 
of being able to produce records from an individ- 
ual which are psychologically consistent. ... 

6. The interpretations based on the records 
must be valid.... 

7. The techniques should not produce major 
disturbances in personality functioning or act as 
precipitating factors to maladjustment. 


VI. The Sentence Completion Test as a Projec- 
tive Technique 





The types of projectives seem to be limited 
only by the ingenuity of the experimenter. They 
are classified in various ways. The most com- 
mon classification is in terms of the amount of 
structuring in the stimulus. A lump of modeling 
clay represents the completely unstructured med- 
ium, and a photograph represents the structured 
type. Whatever medium is used should give free 
scope to action and should provide the widest pos- 
sible latitude in choice of response or forms of 
expression, according to Symonds. 18 Frank 19 
has classified the media into four types on the 
basis of response: the constitutive, constructive, 
interpretive, and cathartic. Under this system, 
the sentence completion test would be classified 
as a constructive type in which the subject organ- 
izes separate meaningless parts into meaningful 
wholes. 

The most important feature of a projective tech- 
nique, as revealed by an analysis of the research 
evidence, is not the type of stimulus provided or 
response given to it, but the interpretation which 
is made of the response. It would seem, there-
fore, that the most important consideration for
the selection of a projective device would be its 
use and the extent to which the medium meets
the criteria proposed by Bell, as quoted in the
preceding section. Up to the present the most
extensive application of projectives has been in
the diagnosis of deviant personalities. Increas-
ing use of the technique with normal people has
been noted.

13. Ibid., p. 619.
14. D. W. Bray, "The Prediction of Behavior from Two Attitude Scales," Journal of Abnormal and Social Psychology, XLV (1950), p. 64.
15. P. Horst, The Prediction of Personal Adjustment (New York: Social Science Research Council, 1941), pp. [illegible].
16. Bell, op. cit., p. 494.
17. Ibid., pp. 494-5.
18. P. M. Symonds, "Projective Techniques," Encyclopedia of Psychology, edited by P. L. Harriman (New York: Philosophical Library, 1946), p. [illegible].
19. L. K. Frank, "Projective Methods for the Study of Personality," Journal of Psychology, VIII (1939), p. 403.

The sentence completion type of projective 
technique allows the experimenter to sample the 
subject’s projection of his personality without the 
subject knowing what dimension is being meas- 
ured. At the same time, the items may be struc- 
tured in such a way that the experimenter can 
score or evaluate the responses by some prear-
ranged plan. This advantage, according to
Forer, 20 allows for a consistent approach to the
test material. When the stimulus is minimally 
structured, the interpreter lacks sufficient in- 
formation to determine what the response 
means. 21

It can be concluded from Forer’s study of the 
structured form that the sentence completion test 
can be used for a variety of purposes with reas- 
onable certainty that the responses will reveal 
attitudes or dynamics in the areas intended. Sy- 
monds 22 expressed the same optimism for pro- 
jective techniques generally. 

The sentence completion technique seems to 
be appropriate to this study, which can be called 
an attitude test as well as a controlled projection 
test. Allport has characterized an attitude as a 
‘‘state of readiness which exerts a dynamic influ-
ence upon the individual’s responses to all objects
and situations with which it is related.’’ 23 This
study hypothesizes an attitude of acceptance of 
‘‘objects and situations’’ as a ‘‘dynamic influence 
upon the behavior of the teacher.’’ There is, 
therefore, in Forer’s terms, a preconceived plan,
and the sentence completion test items and scor-
ing can be structured to meet that demand.


VII. The Use of the Sentence Completion Test in
Other Studies 


Payne’s 24 and Tendler’s 25 studies are gener-
ally considered to be the first attempts to use the
sentence completion technique for personality di-
agnosis. Payne used fifty items in his sentence
completion test as a personality measure in vo-
cational counseling. Tendler distinguished be-
tween the diagnosis of thought reactions and of
emotional responses. His criteria, scoring tech-
niques, and rationale have contributed a great
deal to subsequent studies. Tendler’s efforts ap-
plied to emotional factors the same technique
which had been introduced by Trabue 26 as a lang-
uage scale in 1916, and even earlier by Ebbing-
haus as a test of intelligence. Guilford 27 indi-
cated recently that some return to those tech-
niques was in order: ‘‘I do not now see how some
of the creative abilities, at least, can be meas-
ured by means of anything but completion tests of
some kind.’’


Tendler used twenty incomplete sentence items 
with 250 college girls and validated the test against
autobiographical character sketches and the Wood- 
worth Personal Data blank. The items were in- 
tended to stimulate admiration, anger, love, hap- 
piness, etc. 

Little research was done with this device un-
til World War II. Hutt 28 reported on the use of
the sentence completion technique as a supple- 
ment to data acquired by the Speer through 
other techniques. Holzberg, 29 and Holzberg,
Teicher, and Taylor 30 also reported the use of
the sentence completion techniques of Tendler
and Shor as a part of a diagnostic battery in mil-
itary hospitals.

20. B. R. Forer, "A Structured Sentence Completion Test," Journal of Projective Techniques, XIV (1950), p. 16.
21. Ibid., p. 17.
22. P. M. Symonds, "New Directions for Projective Techniques," Journal of Consulting Psychology, XIII (December 1949), p. 337.
23. G. W. Allport, "Attitudes," in Handbook of Social Psychology, Carl Murchison, Editor (Worcester: Clark University Press, 1935), p. 199.
24. A. F. Payne, Sentence Completions (New York: Guidance Clinic, 1928).
25. A. D. Tendler, "A Preliminary Report on a Test for Emotional Insight," Journal of Applied Psychology, XIV (1930), pp. 123-136.
26. M. R. Trabue, Completion-Test Language Scales (New York: Teachers College, Columbia University, 1916), p. 115.
27. J. P. Guilford, "Creativity," (President's Address, APA, 1950) American Psychologist, V (September 1950), p. 445.
28. Hutt, op. cit., pp. 134-140.
29. J. D. Holzberg, "Some Uses of Projective Techniques in Military Clinical Psychology," Menninger Clinic Bulletin, IX (1945), pp. 89-93.

Shor 31 introduced a variation to the sentence
completion device by structuring the items to
elicit responses to the common experiences of
the soldier. This was the first attempt to use
the device in a particular situation. He also ar-
ranged his fifty items in a ‘‘definite sequence to
permit a carryover or generalization of attitude
from immediate to basic human interest.’’ 32

Stein 33 and Symonds 34 used this technique in
the Office of Strategic Services as a personnel
selection device. Stein’s test sampled relevant
information concerning at least ten areas consid-
ered to be important for personality evaluation.
Symonds used two tests of fifty items each. Sy-
monds concluded, even though he used only eight-
een subjects:


....the sentence completion test cannot
be used to differentiate good and bad ad-
justment by any direct comparison of it-
ems or by psychometric methods. The
sentence completion is descriptive and
not evaluative. 35


Rotter and Willerman, 36 on the other hand, used 
a forty-item test with patients in an AAF Hospital 
and claimed a validity of +.61 against psychiatric
judgment of severity. 

Rohde, 37 Rotter, Rafferty and Schachtitz, 38 
and Wilson 39 have applied the sentence comple- 
tion technique to the detection of emotional mal- 
adjustment in students. Rohde modified Payne’s 
list and used sixty-four items in a validation study 
of 100 ninth-grade students with teachers’ opin- 
ions, interviews, and Evidence Record Data as 
the criterion measures. Using Murray’s system 
of interpreting behavior reactions, she scored
the test on 39 personality categories classified
under needs, press, and inner states. Remark-
ably high correlations were reported, the aver-
age being .78 for the 50 girls and .82 for the
boys.

The standardization and validation study by 
Rotter, Rafferty, and Schachtitz on college stu- 
dents introduced a unique system of scoring by 
example, and the experimenters concluded that 
such a system introduced the possibility of util- 
izing the sentence completion test for a number 
of screening problems. Their test was validated 
against 82 females classified as adjusted or mal-
adjusted by instructors and advanced student clin-
icians; and 124 males, 78 of whom were classi-
fied by their instructors and 46 of whom were refer-
red, or were self-referrals, to the Psychological
Clinic. Correlating test scores with classifica-
tion by the teachers only as adjusted or malad-
justed yielded biserial correlations of .50 for
the females and .62 for the males.
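
For readers unfamiliar with the statistic, a biserial correlation relates a continuous score to a two-way classification (here, adjusted versus maladjusted) that is assumed to reflect an underlying normally distributed trait. The sketch below uses the usual textbook formula, r = ((M1 - M0) / s) * (pq / y), where y is the height of the standard normal curve at the point dividing proportions p and q. The function name and the sample data are invented for illustration; the source does not give the authors' computations.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, in_group):
    """Biserial correlation between continuous `scores` and a dichotomy.

    `in_group` is a list of booleans marking one of the two classes.
    Textbook formula, assuming a normal trait underlies the dichotomy.
    """
    group1 = [s for s, g in zip(scores, in_group) if g]
    group0 = [s for s, g in zip(scores, in_group) if not g]
    if not group1 or not group0:
        raise ValueError("both classes must contain at least one subject")
    sd = pstdev(scores)  # standard deviation of all scores together
    if sd == 0:
        raise ValueError("scores must vary")
    p = len(group1) / len(scores)
    q = 1.0 - p
    normal = NormalDist()
    y = normal.pdf(normal.inv_cdf(p))  # ordinate at the p/q division point
    return (mean(group1) - mean(group0)) / sd * (p * q / y)


# Hypothetical test scores and instructor classifications (True = maladjusted).
example_scores = [118, 131, 140, 152, 127, 160, 149, 122]
example_flags = [False, False, True, True, False, True, True, False]
example_r = biserial_r(example_scores, example_flags)
```

Because the formula rests on the normality assumption, small or markedly non-normal samples can yield coefficients larger than 1, which is one reason such figures are read with caution.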


Wilson used forty items structured to school 
situations and validated the test against seven 
maladjusted boys and girls and fifteen well-ad- 
justed boys and girls in the tenth and eleventh 
grades. A study of the formal aspects of the test 
showed no consistent differences. However, on 
some items she noted some significant differences. 
The maladjusted students felt that the rules were 
too strict and the examinations unfair, while the 
adjusted students felt the rules were not too strict 
and the examinations were hard but necessary. 


Kelly and Fiske 40 have released a prelimin- 
ary report on the evaluation of clinical psychol- 
ogy trainees, in which 78 of 128 P-1 first-year 
trainees were assessed during the spring of 1947 
in a one-week program with a battery of tests 
and interviews. An evaluation of their work was 
made at the end of their second year by the Uni- 
versity and hospital installation supervisors on 











30. Je De Holzberg, A. Teicher, and J. L. Taylor, "Contributions of Clinical i’sychology to :ili- 


tary reg pape in an Army Psychiatric Hospital," Journal of Clinical 
i? 9 


(January 19 pp. 3-95. 


’Sycholory, Lil 





31. J. Shor, "Report on a Verbal Projective Technique," Journal of Clinical Psychology, Il (1946), 


PPpe 279-282. 


Toide pe 280. 








33. Me Ie Stein, "The Use of a Sentence Completion Test for the Diagnosis of .’crsonality," Jour- 





nal of Clinical Psychology, III (1947), pp. 47-56. 





Ble Pe Me Symonds, "The Sentence Completion Test as a Projective Technique," Journal of Abnormal 





Tbid.,p. 321. 


and Social Psychology, XLII (July 1947), pp. 320-329. 





36. Je Be Rotter and B. Willerman, "The Incomplete Sentences Test as a Method of Studying Person- 
ality," Journal of Consulting Psychology, XI (1947), pp. 43-8. 





37. Ae Re Rohde, "Explorations in Personality by the Sentence Completion Method," Journal of Ap- 


plied Psychology, XXX (April 1946), pp. 169-181. 













June, 1953) 


(1) skill in diagnosis, (2) skill in individual psy- 
chotherapy, (3) skill in research, and (4) prefer- 
ence for hiring. Of the four projective tests used, 


....the one showing the most promising 
validities is the Sentence-Completion Test, 
with which our projectivists had had but 
little previous experience. Furthermore, 
the Sentence-Competion Test and the The- 


matic Apperception Test were interpreted 
‘blind, ’°41 


VIII. The Use of Projective Tests in Evaluation
of Teachers 





A study by Alexander 42 has indicated that a 
projective test can predict ways in which teach- 
ers interact with children as revealed by obser- 
vational data. Eight pictures of the TAT type 
were used, showing teacher-pupil relations. Al- 
exander concluded that one can predict ways of 
behaving and that these predictions have close 
agreement with observed behavior. 


IX. Summary 


1. The understanding of personality measure- 
ment is bound up with the underlying theoretical 
conception of personality. Projectives are based 
upon a dynamic conception of personality rather 
than a static process. 

2. Personality is structured and all experiences 
are integrated into a pattern consistent for the in- 
dividual. 

3. Behavior is functional and the personality 
structure reveals itself in the behavior of the in- 
dividual consistent with his concept of himself. 
Logical consistency of behavior may be present;
psychological consistency is always present. 

4. Personality is a depth phenomenon. Sur- 
face manifestations as revealed in observable 
and controlled situations make possible inferences 
regarding the latent structure and content. 

5. Projective techniques are attempts to ex- 
plore the nature of the latent structure and con- 
tent of the personality in order to predict overt 
behavior. 







6. It now seems possible to construct a meas- 
uring device appropriate to a variety of clinical, 
applied, and experimental purposes. 

7. The sentence completion type of projective 
technique allows for freedom of response, and 
at the same time the stimulus can be sufficiently 
structured to permit more meaningful interpre- 
tations of the responses. 

8. Administration of the test is relatively 
simple; it is not time-consuming, and no special 
training is ordinarily necessary. 

9. The sentence completion method now lends 
itself easily to objective scoring for screening 
or experimental purposes. Personality analysis 
for clinical appraisal and interpretation requires 
the same general skill and knowledge of person- 
ality as is necessary for other projective tech- 
niques. 

10. Even though the subject has a greater op- 
portunity for disguise of purpose in the sentence 
completion test than in other projective devices, 
the advantage of partial control of the response
allows for meaningful interpretations in certain
experimental situations. 

11. The reliability of responses and scoring 
is not high but it is within the limits of accepta- 
bility. The validity of the sentence completion 
method has not been high enough to eliminate cor- 
roborative data, but objectification of scoring has 
recently improved the validity. 


SECTION VI 


CONSTRUCTION OF SENTENCE COMPLETION 
TEST 


I. Selection of Items 





IT WAS first necessary to formulate a well- 
defined purpose and objective for the sentence 
completion test (hereafter referred to as SCT) as 
a whole. The hypothesis for this study, as prev- 
iously stated, was to determine the relationship 
between the teacher’s attitude of acceptance and 
his teaching efficiency. The SCT was thought to 
be the best device for measuring this attitude.
Previous studies had established this technique
as a valid instrument. It also seemed possible
to structure the items sufficiently to allow for
meaningful responses without destroying the sub-
ject’s freedom of response.

38. J. B. Rotter, J. E. Rafferty, and E. Schachtitz, "Validation of the Rotter Incomplete Sentences Blank for College Students," Journal of Consulting Psychology, XIII (1949), pp. 348-356.
39. I. Wilson, "The Use of a Sentence Completion Test in Differentiating Between Well-Adjusted and Maladjusted Secondary School Pupils," Journal of Consulting Psychology, XIII (December 1949), pp. 400-402.
40. E. L. Kelly and D. W. Fiske, "The Prediction of Success in the V. A. Training Program in Clinical Psychology," American Psychologist, V (August 1950), pp. 395-406.
41. Ibid., p. [illegible].
42. T. Alexander, Jr., "The Prediction of Teacher-Pupil Interaction with a Projective Test," Journal of Clinical Psychology, VI (July 1950), pp. 273-276.

The search of the literature revealed few stud-
ies in which the experimenter used the technique
to measure a specific attitude in a particular occu-
pational group. Forer pointed out how this could
be done. Furthermore, most of the studies have
been concerned with the measurement of person-
ality for clinical and diagnostic purposes. Rotter
et al. have formulated an objective and econom-
ical scoring system for screening and experimen-
tal purposes.

In this study the investigator established the 
hypothesis that the effective teacher was one whose 
personality organization was such that he could 
accept into his self-concept and phenomenological 
environment experiences perceived to be enhanc- 
ing as well as threatening. The ineffective teach- 
er, on the other hand, would have a tendency to 
reject the threatening experiences and accept only 
the enhancing ones. In order to measure the tea- 
cher’s personality organization on this dimension 
of acceptance-rejection, it was necessary to ob-
serve several principles common to sentence com-
pletion tests in general and to this SCT in partic-
ular.


1. The stimulus items, or phrases, should 
elicit an emotional response rather than a thought 
response. 

2. They should elicit many responses of a dis- 
criminating nature. 

3. They should stimulate the subject to pro- 
ject his self-concept into the responses. 

4. They should, in part, be structured to elicit 
responses peculiar to the educational process. 

5. They should elicit responses which can be 
scored in a reliable and valid manner. 


It was found that the incomplete sentence words 
or phrases used in the tests of Forer, Tendler, 
and Rohde met the five principles above. The in-
vestigator felt at the same time that the SCT should
contain more items designed to elicit responses
which would satisfy the fourth principle, that of
educational situations.

The list of 91 items finally selected for the
SCT was made up from the following sources:
(See Appendix C in original thesis) 


B. R. Forer
Investigator
A. D. Tendler


Forer had classified his 100 items under: (1) 
inter-personal figures, (2) dominant drives, (3) 
causes of own emotional responses, and (4) re- 
actions to emotionally stimulating situations. A 
sample of each category was selected for the SCT. 
The 20 items constructed by the experimenter 
were in large part structured to elicit attitudes 
toward students, teachers, parents, and school 
situations. A few items were suggested by other 
experimenters and writers. Rohde’s items elic- 
ited reactions to other people, authority, work, 
future situations, and sources of pleasure, envy, 
and ambition. Two of Tendler’s items sampled 
sources of annoyance, and one sampled the source 
of satisfaction. 

The selection of these 91 items was done pri- 
marily on an empirical basis as a result of the 
experience of others. The determination of wheth- 
er these items would measure the hypothetical 
personality organization was theoretical. Only 
through a process of standardization of the scor- 
ing system could it be determined which item 
responses were discriminating ones. It will be 
seen in the rest of this section that 26 of the 91 
items could be used as significant ones in the val- 
idation process. 


II. Preliminary Assumptions for Scoring System





A review of the studies using the projective 
technique in personality measurement revealed 
certain phenomena which may impair the validity 
of this technique, or may increase its validity if 
the investigator can take advantage of them. In 
either event, one must be cognizant of them in 
order correctly to score and interpret projective 
data. 

1. The investigator must have some assurance 
that the subject is revealing his true self, or at 
least the dimensions of personality that are being 
examined. One of the chief criticisms of the 
paper and pencil objective techniques has been 
the lack of assurance that the subject was respond- 
ing to the test stimuli in a manner consistent with 
his conception of himself. If the examiner wished
to consider this discrepancy as a characteristic
of the subject’s behavior, there was little that
could be done about it. Some of this characteris-
tic might be revealed in a discussion of test re-
sults with the subject. However, this evaluation
would be quite subjective. The proponents of pro- 
jective techniques claim that they are able to 
take advantage of the subject’s efforts to ‘‘look 
good’’ in the test situation. 

McGinnies 1 has called attention to the distinc-
tion between perception as occurring at the level
of implicit response, and reaction as overt ob- 
servable response. This investigator contends 
that perception is one way of responding and the
perception of a stimulus may delay or even pre-
vent reaction by raising the perceptual threshold.
In any attempt to measure an aspect of personal-
ity organization, the investigator would there-
fore need to anticipate this possibility and adjust
his technique to it.

1. E. McGinnies, "Personal Values as Determinants of Word Association," Journal of Abnormal and Social Psychology, XLV (1950), p. 28.

Spencer 2 found that among 192 high school 
students, 22 percent of the subjects admitted that 
they would have left some of the questions in an 
experience appraisal blank unanswered if their 
signatures had been requested, and about 9 per- 
cent said they would have answered some of the 
questions untruthfully. Spencer observed that 
the latter group also had the highest average con- 
flict score. If such evasion is typical of apprais- 
al blank responses, it is necessary for the inves- 
tigator to allow for it in his scoring of any device, 
or preferably to use it as a significant factor. 
The flexibility of projective techniques has been 
suggested as one of the advantages of that meth-
od.


Another aspect of the problem raised by Mc- 
Ginnies was investigated by Carter, 3 who found 
that changes in palmar skin conductivity and re- 
action time as measured by the galvanometer 
were significantly greater in individuals with 
problems and in psychoneurotics than in normals. 
However, the oral responses of the control and 
experimental groups varied little on a modifica- 
tion of the Tendler Emotional Insight Test. This 
would seem to indicate that the perceptual thresh- 
old for verbal stimuli was higher than that for 
physiological stimuli, and, therefore, devices 
for measuring reactions to verbal stimuli were 
not so effective as those measuring reactions to 
physiological stimuli. Carter does not, however, 
indicate how the investigator would determine the 
source of the neurosis or the specific problem 
troubling the subject. 

2. A problem related to that above is the one 
of clinical judgment and clinical intuition as con-
sidered by Klehr, 4 who showed that clinical judg-
ment, which is not entirely intuitive, was com-
parable to objective scoring. Fifteen experienced
clinicians and a control group of graduate students 
were used to measure the scatter patterns on an 
equal number of normals and two groups of clin- 
ical categories. Both the clinicians and graduate 
students demonstrated results which were signif- 
icantly better than chance. Klehr attributed this 
ability to training and experience. 

3. A problem in projective techniques, and 
especially in the sentence completion form, is 
the determination of whether the subject reveals 
more about himself when he is the subject or 
when others are the subject. McGinnies 5 con- 
cluded that a subject will respond sooner to a 
word symbolizing his highest value area than he 
will to a word symbolizing his lowest value area. 

Sacks 6 found that the first-person form yield- 
ed five out of six significant differences, in his 
study of one hundred Veterans Administration 
patients. This may be consistent with the ‘‘high- 
est value’’ words as noted by McGinnies in that 
the person’s highest value is himself. 

4. The influence of response sets upon the val- 
idity of personality tests has been examined by 
many investigators. Cronbach 7 has defined a
response set as ‘‘any tendency causing a person 
to give different responses to test items than he 
would when the same content was presented in 
different form, ’’ and has found 8 that response 
sets are most influential in those situations where 
the subject is allowed to define the situation, and 
that they should, therefore, be avoided except in 
projective devices which capitalize on ambiguity. 
He makes a further allowance for projectives by 
noting that response sets are to a small degree 
correlated with external variables such as atti- 
tudes, interests, and personality. 

Rundquist 9 seemed to discredit the effect of 
response sets in a study in which 111 factory girls 





2. D. Spencer, "The Frankness of Subjects on Personality Measures," Journal of Educational Psychology, XXIX (1938), pp. 26-35.

3. H. J. Carter, "A Combined Projective and Psychogalvanic Response Technique for Investigating Certain Affective Processes," Journal of Consulting Psychology, XI (1947), pp. 270-275.

4. H. Klehr, "Clinical Intuition and Test Scores on a Basis of Diagnosis," Journal of Consulting Psychology, XIII (1949), pp. 34-38.

5. E. McGinnies, "Personal Values as Determinants of Word Association," Journal of Abnormal and Social Psychology, XLV (1950), pp. 28-56.

6. J. M. Sacks, "The Relative Effect Upon Projective Responses of Stimuli Referring to the Subject and of Stimuli Referring to Other Persons," Journal of Consulting Psychology, XIII (1949), pp. 12-20.

7. L. J. Cronbach, "Response Sets and Test Validity," Educational and Psychological Measurement, VI (1946), p. 475.

8. L. J. Cronbach, "Further Evidence on Response Sets and Test Design," Educational and Psychological Measurement, X (1950), p. 21.













were asked to indicate how well 200 descriptive 
words and phrases applied to them, and immed- 
iately afterward how well they liked or disliked 
each of 100 activities. He found a consistency 
represented by a correlation of .4, and conse- 
quently doubted that response sets revealed any- 
thing basic about the individual. Rundquist indi- 
cated that ‘‘the correlation was largely a function 
of the type of material, directions, mood, or 
some other temporary condition. ’’ 

The present investigator has attempted to 
measure the subject’s tendency to project a 
certain attitude —acceptance or rejection—in 
several situations as structured by the stimulus 
phrases of the incomplete sentences. He has al- 
so been aware of the other problems raised above, 
and while the answers are still hypothetical, he 
has proceeded on the assumption that his test 
design represented feasible possibilities. An ex- 
amination of the findings may confirm them or 
may indicate better procedures. 


I. Criterion for Standardization of Scoring 
System 





The students’ evaluations of the 21 teachers 
at School I on Scale A, teaching effectiveness,
were used as the criterion measure for standard- 
izing the scoring system. Scale A was selected 
because the investigator felt that it was the only 
one of the three scales that could be supported by 
previous experimental evidence; and also because 
it would have the greatest applicability for screen- 
ing purposes. A high correlation between the 
test scores and Scales B and C would have exper- 
imental significance, but little or no practical 
significance unless it were known how the same 
sample of teachers were evaluated on their teach- 
ing ability. The student evaluations were used 
in preference to administrator ratings because 
the primary purpose of this study was to examine 
the relationship between the teacher’s attitude of 
acceptance, or permissiveness, and his teaching 
effectiveness in the classroom. Those most concerned
with this relationship and in the best position
to evaluate it were the students.





(Vol. XXI 


IV. Review of Rationale 





The rationale for this study as contained in 
Section II has been drawn from the contributions 
of those who contend that the individual’s person- 
ality is organized around his attempt to maintain 
a state of balance or unity that is consistent with 
his concept of himself and his perceptual envir- 
onment.10 Emotional states are related to goal- 
directed behavior, and the kind and intensity of 
the emotion are related to the perceived signif- 
icance of the behavior for the maintenance and 
enhancement of the organism. The unpleasant 
and/or excited feelings accompany the goal-seek- 
ing effort of the individual, and the calm and/or 
satisfied emotions accompany satisfaction of the 
need. 

Those experiences which tend to enhance the
organism are assimilated or accepted into a consistent
relationship with the concept of self; those
experiences which are inconsistent with the organization
of self are perceived to be threats, and
hence the organism tends to reject them or to defend
itself against them. Under certain conditions
involving the absence of threat, experiences which
are inconsistent with the self-concept or self-structure
may be accepted through a revision of
the self-structure. When the individual perceives
and accepts into one consistent and integrated system
all his sensory experiences, then he is necessarily
more understanding and accepting of others.

This attitude of acceptance is not, according 
to Prescott, 11 to be confused with blind, passive 
acceptance, which produces dependence, the op- 
posite of maturity. Sheerer 12 has said that an 
accepting person has internalized certain values 
and principles which serve as a general guide for 
behavior and relies upon this guide rather than 
conventions or standards of other individuals, and 
does not hate, reject, dislike, or pass judgment 
against others when their behavior seems to be 
in contradiction to his own. 

This rationale has been reduced in this study 
to the hypothesis that the accepting person is a
more effective person, and the accepting or per- 
missive teacher is a more effective teacher, than 





9. E. A. Rundquist, "Response Sets: A Note on Consistency in Taking Extreme Views," Educational and Psychological Measurement, X (1950), p. 98.

10. C. R. Rogers, Client-Centered Therapy (New York: Houghton Mifflin Co., 1951), Ch. 11. D. Snygg and A. W. Combs, Individual Behavior (New York: Harper and Bros., 1949), Chs. and 5.

11. D. A. Prescott, Emotion and the Educative Process (Washington, D.C.: American Council on Education, 1938), p. 105.

12. E. A. Sheerer, "An Analysis of the Relationship Between Acceptance of and Respect for Self and Acceptance of and Respect for Others in Ten Counseling Cases," Journal of Consulting Psychology, XIII (1949), pp. 170-171.






















the non-accepting person or teacher. 


V. Rationale for Scoring Part I 13





The scoring system devised by Rotter and Raf- 
ferty served as the basis for scoring the SCT. 14 
Essentially their system consisted of scoring 
their test from examples contained in the manual. 
Each item response was assigned a weight from 
0 to 6 and an over-all score was obtained by to- 
taling the weights of each item. The scoring ex- 
amples were illustrative of certain principles, 
and were not intended to contain all possible sen- 
tence completions. 

Rotter’s principles involved a distinction be- 
tween conflict and positive responses. The con- 
flict responses were those indicating an unhealthy 
or maladjusted frame of mind. The responses 
were scored according to the severity of the con- 
flict or maladjustment expressed and they were 
assigned a weight of 4, 5, or 6. The positive re- 
sponses were those indicating a healthy or hope- 
ful frame of mind and they were scored 0, 1, or 
2, depending upon the degree of good adjustment. 
Between the conflict and positive responses were 
those Rotter designated as neutral ones, which 
did not fall clearly into either the positive or neg- 
ative. They generally lacked emotional tone or 
personal reference, or the responses were de- 
scriptive and were found to be characteristic of 
both the adjusted and maladjusted. The neutral 
responses were scored 3. 

In addition to the basic principles referred to 
above, Rotter established certain other principles 
to clarify the scoring of many questionable responses
which would inevitably arise in such a
system. These principles were designed to cov- 
er omitted and fragmentary responses, those in 
which the subject adds a qualification, those re- 
sponses in which the subject expresses more 
feeling than indicated by the example, and those 
instances in which the subject gives an unusually 
long response. 

The main distinction between Rotter’s system 
of scoring and that of this investigator is in the 
definition of the scoring intervals, or the scoring 
principles. Rotter conceived of a scorable dif- 
ference in intensity of response within each of 
the positive and conflict types. This investigator 
defined the scoring intervals in terms of ego dis- 
tance. According to the rationale for this study 
established in Section I, there is a hierarchy of 







situations or experiences in one’s environmental 
field, as they relate to the individual’s self-con- 
cept. In one respect, it is easier for an individ- 
ual to accept impersonal situations than other 
people. Likewise, it is easier for the individual
to accept other people than it is to accept himself.
In another respect it can be assumed that
it requires less ego strength for the individual
to accept impersonal situations than it does to
accept others and self. The reaction of the indi- 
vidual to a perceptual stimulus can be said to be 
in terms of his interpretation of that stimulus as 
threatening or enhancing. If he interprets it as 
threatening to his self-concept, he will reject it; 
if it is perceived to be enhancing, he will accept 
it. Whether the stimulus is reasonably interpret- 
ed as threatening or enhancing is assumed to be 
dependent upon the individual’s ego strength or 
the degree of his personality organization and in- 
tegration resulting from an acceptance of self. 

It was decided by this investigator to score the 
SCT responses according to the basic principles 
indicated above. An experience involving an im- 
personal situation was assumed to exist in the 
subject’s environmental field (as defined by Snygg 
and Combs) and therefore was thought to be less 
ego involving than those experiences involving 
other people. Likewise those experiences which 
directly involved the self-concept were potentially 
more threatening, or ego-involving, than those 
involving other people. 


The investigator set up a scoring system in
which the individual’s attitude toward self received
the maximum weight of 6 for self-acceptance
and 0 for self-rejection. It was assumed
that a greater degree of ego strength or personality
organization was involved in experiences
requiring self-acceptance, and hence such responses
should receive a score of 6. Likewise, responses to those
experiences simulated in the stimulus phrases which
were threatening to the self-concept, and were therefore
rejected, should receive the maximum penalty, or 0.
On this same dimension, a projected response
which indicated an acceptance of others would
receive a score of 5, and one which indicated a
rejection of others a score of 1. Acceptance of an impersonal
situation was assigned a weight of 4 and
rejection of it a weight of 2. Those responses
which Rotter called ‘‘neutral’’ were defined as
ambivalent by this investigator and given a weight
of 3.





13. The investigator first attempted to score the 91 items for the 21 participating teachers at School I. It soon became apparent that 31 of the items could not be scored on the basis of the basic principles contained in this subdivision and subdivision V, and they were rejected. After the scoring system had been standardized for the remaining 60 items as Part I, the 31 items were examined again and became Part II of the SCT. However, the rationale for the two parts is the same.


14. J. B. Rotter and J. E. Rafferty, Manual, The Rotter Incomplete Sentences Blank, College Form (New York: The Psychological Corporation, 1950), Ch. II.







The seven-step scale, 0 to 6, used by this 
investigator was, therefore, similar to that used 
by Rotter with the exception of the definitions of 
the seven steps or intervals, and also the fact 
that the SCT was scored on a linear or unidimen- 
sional basis of acceptance. It was also assumed 
that the scale was an equal-interval one. There 
seemed to be no experimental evidence to the ef- 
fect that the distance between self-acceptance 
and acceptance of others was more significant 
than that between acceptance of others and accep- 
tance of situations. 


VI. Scoring Principles for Part I 





With this basic frame of reference, certain 
scoring principles were established. 


1. Attitudes toward self
a. Responses to stimulus phrases which indicated
that the subject was accepting himself,
or projecting a positive attitude about
himself, were scored as 6.
b. Responses which indicated a self-rejecting
attitude were scored as 0.


2. Attitudes toward others
a. Responses which indicated that the subject
was accepting others were scored as 5.
b. Responses which indicated a rejecting attitude
of others were scored as 1.


3. Attitudes toward situations
a. Responses which indicated an accepting attitude
toward impersonal situations or experiences
were scored as 4.
b. Rejecting responses toward situations and
experiences were scored as 2.


4. Ambivalence
a. There were no stimulus phrases in the test
which were structured to elicit an ambivalent
or evasive response. However, if
the subject chose to project an evasive response,
or used a catch phrase, stereotype,
or a song title, the response was evaluated
as ambivalent and scored as 3.
b. The ambivalent response was not considered
to be inconsistent with the rationale
outlined above. That area where the subject
crosses over from acceptance to rejection
was thought to be significant. The
subject who uses an ambivalent response
does so for some dynamic reason.


As the investigator progressed with the standardization
of the scoring system, certain problems
presented themselves for which consistent
answers were imperative. While it was desirable
to reduce the scoring of the SCT to a simple
and objective basis, it was not always possible
to do so. A certain amount of controlled judgment
was necessary:


1. It was necessary to determine whether the 
responses should be scored objectively for their 
intrinsic meaning, or whether a subjective inter- 
pretation of latent meaning was indicated. It was 
determined that all responses should be scored 
on the one dimension under investigation—accep- 
tance. To interpret the responses otherwise 
would be unreliable for purposes of this investigation.
Only one thought was kept in mind, frustrating
though it was, namely, is the subject accepting
or rejecting himself, others, or situations?

2. Most stimulus phrases were structured to 
elicit a projected attitude toward self, others, or 
situations. It was noted that the subject would 
occasionally twist the response to include a reference
to self when normally it would be consistent
to refer to others. ‘‘I feel that people....like me’’
would be scored 6 for self-acceptance;
however, the response ‘‘....are interesting’’
would be scored 5 because the subject responded
to the structure of the stimulus and made an
‘‘accepting’’ response.

3. Qualifications were often made by the subject.
The subject was free to respond in any
way he desired, and the response had to be scored
on the basis of what was said. In ‘‘The students
in this school are....good considering their parents,’’
the subject responded to ‘‘students,’’ as
the stimulus was structured to elicit, but he
introduced an attitude toward ‘‘parents’’ which
was more emotionally loaded, and therefore the
response was scored 1 instead of 4, as it would have been
had he not qualified it.

4. Tense was not considered to be significant.
‘‘When people contradict me....it used to make
me furious’’ was scored as 0 even though it might
have been reasoned that contradiction did not make
him furious at the time he took the test. Again,
the subject was free to say that contradiction did
not bother him; however, he did not and, therefore,
the response was scored as given.

5. Cultural semantics were evaluated in terms
of current usage. ‘‘When I meet a parent....I
expect most anything’’ was considered to be a rejecting
response and was scored 1. ‘‘Most women
act as though....they are demure’’ was scored
3 for ambivalence.

6. Responses with a religious quotation or im- 
plication were scored 3 for ambivalence. 

7. Omissions or fragmentary responses were
not scored, although an allowance was made for
them with a correction factor as used by Rotter,
$S = \frac{N}{N - \text{omissions}} \times \text{Score}$, in which the total score (S)
was equal to the number of items (N) divided by
the number less the omissions, times the obtained
score.
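The correction for omissions described in point 7 prorates the obtained score up to the full number of items. A minimal sketch of that arithmetic follows; the function name `corrected_total` and the example figures are hypothetical and are not taken from the study's data.

```python
# Illustrative sketch of the omission correction described in the text:
# total = (number of items / number of items actually scored) * obtained score.
def corrected_total(obtained_score, n_items, n_omitted):
    """Prorate an obtained score to allow for omitted or fragmentary items."""
    scored = n_items - n_omitted
    if scored <= 0:
        raise ValueError("at least one item must have been scored")
    return n_items * obtained_score / scored
```

As a hypothetical illustration, a subject who omitted 3 of 60 items and obtained 171 on the remaining 57 would receive a corrected total of 60 × 171 / 57 = 180.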










All items in the SCT did not elicit responses 
which were representative of each of the seven 
intervals in the scoring scale. However, Item 
32 did contain the following sample responses: 


32. When people contradict me....


6. I don’t mind it too much.
I have learned to accept it.
I am willing to reconsider my views.
I am ready to defend my statement.


5. I try to find out if they have a good reason.
I feel it is their right.


4. I attempt to find out why.


3. It is usually disconcerting, sometimes
stimulating.


2. I ignore it if courteously done.


. It makes me more positive in maintaining 
my position. 
It places me on the defensive. 
I stop talking. 
I don’t like it. 
It is apt to irritate me. 
I become angry. 
I become embarrassed. 
It makes me feel funny. 


VII. Rationale and Scoring Principles for Part II
of SCT


The 31 items which had been difficult to score
with the original principles, which were effective
for the 60 items in Part I, were examined again.
It was desired to use the same rationale, and it
was therefore necessary to devise a different set of
scoring principles.

The investigator divided the teachers in the 
standardization group, School I, into the effective 
and ineffective teachers according to the students’ 
evaluations on Scale A, teaching effectiveness. 
He then noted the responses to the 31 items which 
were given by each group. In general, the items 
were structured to elicit projected drives and 
sources of satisfaction and annoyance. The effec- 
tive teachers’ responses on some of the items 
seemed to differ from those of the ineffective 
teachers’ responses in several respects. (1) The 
responses of the effective teachers were consid- 
ered to be legitimate or acceptable reaction feel- 
ings. (2) They showed ego strength and accept- 
ance of responsibility without being self-reject- 
ing. (3) They demonstrated acceptance of self 







and others, while the responses of the other 
group were more critical and showed a tendency 
to moralizing. (4) The responses of the effective 
teachers seemed to reflect a broader and more 
abstract phenomenological self without being evas- 
ive, while the other teachers were revealing a 
more concrete and personalized structure in a 
negative direction. 

Those responses which were characteristic of 
the teachers evaluated above the median on Scale 
A were given a plus score, and those responses 
characteristic of the teachers below the median 
were assigned a negative score. An inspection 
of the scores for the 21 teachers on the 31 incom- 
plete sentences revealed a significant trend on 
13 of the items. 
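The division of the standardization group at the median of Scale A can be sketched as follows. This is an illustration under stated assumptions only: the function name `split_by_median` and the dictionary of ratings are hypothetical, and the text does not say how a teacher falling exactly at the median was treated, so this sketch simply leaves such a teacher out of both groups.

```python
from statistics import median


# Illustrative sketch: divide teachers into those rated above and below the
# median on the criterion scale. Handling of ties at the median is an
# assumption of this example, not something the text specifies.
def split_by_median(ratings):
    """ratings maps a teacher identifier to a Scale A rating."""
    cut = median(ratings.values())
    above = {teacher for teacher, r in ratings.items() if r > cut}
    below = {teacher for teacher, r in ratings.items() if r < cut}
    return above, below
```

Responses typical of the first group would then be keyed plus and those typical of the second group keyed minus, as described above.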

The plus responses were interpreted as char- 
acteristic of effective teachers or well-integrated 
and accepting personalities. This criterion of 
effective behavior seemed to be similar to the 
principles established by Tendler.15 He inter- 
preted the responses to his test on the basis of 
positive and negative ego or social reference. 
The ‘‘ego positive’’ responses contained an assertive
reference while the ‘‘ego negative’’ was
self-depreciating. The ‘‘social positive’’ dem- 
onstrated an interest and feeling for others while 
the ‘‘social negative’’ responses revealed a fault- 
finding attitude. 

This investigator found that the effective teacher
responded to the phrase ‘‘I pity....’’ with
statements such as ‘‘....the boy; poor people;
the unfortunate.’’ The ineffective teacher was
more apt to respond with ‘‘....a selfish person;
the poor teachers; a person with no self-reliance.’’
The stimulus phrase ‘‘I was most depressed
when....’’ elicited positive responses from most of
the effective teachers, such as ‘‘....out of work;
I felt I was not teaching properly.’’ The negative
responses were more concrete and trifling: ‘‘....
I heard news of the election; the students didn’t
care to learn; the fog came.’’

This investigator believed that these positive 
responses reflected a better personality organ- 
ization and a stronger self or ego-structure than 
those responses judged to be negative. He also 
believed that the positive responses were elicited 
by dynamics similar to those scored as accept- 
ance responses in Part I. 


VIII. Reliability of SCT 


The common methods of establishing reliabil- 
ity —test-retest and split halves—were not con- 
sidered to be appropriate to this study. The var- 
ious aspects of personality and tendencies to be- 
have are more readily modified by experiences 





15. A. D. Tendler, "A Preliminary Report on a Test for Emotional Insight," Journal of Applied Psychology, XIV (1930), pp. 122-136.










and fluctuations in perception. The test-retest 
technique would measure changes in behavior 
rather than reliability. This investigator secured 
criterion and predictive measures at the same 
time, or reasonably so, in order to reduce the 
influence of temporal changes to a minimum. If 
this had not been done, the conclusions drawn 
from the experiment would have been less defen- 
sible, as one would then never know how much 
changes in experience and perception had influenced
the results. Any significant correlations
resulting from this study can, therefore, be used
to predict teaching effectiveness from scores on 
the SCT. 

The split-half technique was not considered 
to be appropriate since the items on an incom- 
plete sentence blank are not considered to be e- 
quivalent. Any attempt to establish equivalence 
would be on an a priori basis, and a high correl- 
ation of reliability would merely indicate that the 
two parts happened to be equivalent. The investigator
could not know why there was internal consistency.
In fact, there is little justification for
the split-half technique except that empirically 
it has been found that the two methods yielded ap- 
proximately the same results. Therefore, the 
split-half technique saves several weeks’ time 
consumed by the test-retest method. 

Reliability of scoring the SCT, however, was 
considered to be very important. While consistency
on the part of the subject was not considered
to be a problem in projective personality
measurement, it was necessary to establish in- 
ter-scorer reliability. The investigator deter- 
mined reliability of scoring in two ways: 


1. Phase I: Consistency of scoring 445 rep- 
resentative item responses for Part I between 
the investigator and five other scorers. 

2. Phase II: Consistency of total scores between
the investigator and two other scorers for
Parts I and II of the SCT. 


For Phase I of the reliability procedure, a 
compilation was made of 445 representative re- 
sponses to the 27 items which met the inspection 
standards of significance as indicated in Section 
V. These scoring examples were to be used in 
the validation procedure of Schools II and III. It
was not expected that every possible response 
to the incomplete sentences would be found in the 
responses of the teachers in School I. The scoring
by example could only hope to establish by
illustration the basic scoring principles. Judg- 
ment would be necessary in scoring those re- 
sponses for which there was no example. 

Each of the five raters who were selected to
score the 445 sample responses had had some
teaching and counseling experience. Three of
the five were graduate students in clinical psychology.
Two of the five had had training in education
primarily and some training in psychology.

The five raters were given a list of the items 
to be scored with only the principles of scoring 
as contained in subdivision IV of this Section. 
They were asked to evaluate each response item 
and determine whether it should be scored 6, 5, 
4, 3, 2, 1, or 0. Each rater scored the items 
independently. 

Part I of Table II shows the degree of agree- 
ment between the six scorers, including the in- 
vestigator. All six scorers, on 255 responses 
or 57.3 percent, agreed that the item was either 
an accepting or rejecting one. If the response 
was scored as 6, 5, or 4 it was considered to be 
an acceptance response. If it was scored 2, 1, 
or 0, it was considered to be a rejection response. 
When the ambivalent score of 3 was added to 
either the acceptance range, 6-4, or the rejec- 
tion range, 2-0, the six scorers agreed on 136 
additional items, or 30.6 percent. There was 
disagreement on whether the response was pro- 
jecting an attitude of acceptance or rejection in 
54 items only, or 12.1 percent. There seemed 
to be little difficulty in the recognition of a differ- 
ence between an expressed attitude of acceptance 
and one of rejection. 
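The three agreement categories in this paragraph can be stated as a rule over the six scores given to one response. The sketch below is one reading of that rule, offered as an assumption rather than as the investigator's tabulation procedure; the names `agreement_category` and `percent` are hypothetical. The percentages reported above follow from the counts 255, 136, and 54 out of 445.

```python
# Illustrative sketch: classify the six scorers' weights for one response.
# Scores 6, 5, 4 count as acceptance; 2, 1, 0 as rejection; 3 as ambivalent.
def agreement_category(scores):
    if all(s >= 4 for s in scores) or all(s <= 2 for s in scores):
        return "unanimous"
    # Agreement once the ambivalent score of 3 is allowed on either side.
    if all(s >= 3 for s in scores) or all(s <= 3 for s in scores):
        return "with_ambivalence"
    return "disagreement"


def percent(count, total):
    """Percentage of responses, rounded to one decimal place."""
    return round(100 * count / total, 1)
```

With these definitions, `percent(255, 445)`, `percent(136, 445)`, and `percent(54, 445)` give the 57.3, 30.6, and 12.1 percent figures quoted in the text.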

Part II of Table II shows the number of scorers
who agreed on one interval score. All six
scorers were in complete agreement on 181 responses,
i.e., all six scored the item as 6, 5,
etc. Five out of six agreed on one interval score
for 84 responses. At least three of the scorers
agreed on 99 percent of the item responses.

It was thought that the range of scores assigned
to each response would be indicative of the inter-scorer
reliability. Part III of Table II shows that
in 83.8 percent of the scored responses there
was not more than two points difference between
the six scorers.

The responses to the stimulus phrase ‘‘When 
people contradict me....’’ are illustrative of the 
differences noted between the six scorers. The 
response ‘‘....I wonder about their background,’’ 
was scored as a rejection of others and given a 
score of 1 by all scorers. To the response ‘‘....
I become embarrassed,’’ five scorers assigned
a score of 0, assuming that the response was a
self-rejecting one. One scorer judged it to be a
projection to a situation and therefore scored it
as 2. The same scorer was alone on two other
responses to the same stimulus phrase, ‘‘....I
become angry’’ and ‘‘....it places me on the defensive.’’
He scored them as 2, while the other
five scored them as 0. This difference in inter- 
pretation would account for the larger number of 
responses with a range of 2 points than of one 
point. The scorers seemed to have little difficul- 
ty discriminating between self and others, but 
they did show a differentiation of the situational 
responses as just noted. 
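The two summaries used here, the number of scorers sharing one interval score and the range of scores given to a response, can be computed directly from the six weights. The sketch below is illustrative; the function names are hypothetical, and the example score lists are built from the scorings described in the text for the responses to ‘‘When people contradict me....’’.

```python
from collections import Counter


# Illustrative sketch of the two per-response summaries discussed above.
def modal_agreement(scores):
    """Number of scorers who gave the most frequently assigned score."""
    return Counter(scores).most_common(1)[0][1]


def score_range(scores):
    """Difference between the highest and lowest score assigned."""
    return max(scores) - min(scores)


# Five scorers gave 0 and one gave 2 to "....I become embarrassed".
embarrassed = [0, 0, 0, 0, 0, 2]
# All six scorers gave 1 to "....I wonder about their background".
background = [1, 1, 1, 1, 1, 1]
```

For the first response five of six scorers agree and the range is two points, which is the pattern the text offers to explain why ranges of two points were more common than ranges of one.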













TABLE II


INTER-SCORER RELIABILITY: PART I (PHASE I)


Scorers                                      Number of
                                             Responses    Percent


1. Agreement between six scorers

   Acceptance or rejection                      255         57.3
   Acceptance or rejection with
     ambivalence score                          136         30.6
   Disagreement                                  54         12.1
   Total                                        445        100.0


2. Agreement on interval score between
   six scorers

   All six                                      181
   Five out of six                               84
   Four out of six
   Three out of six
   Two out of six
   Total


3. Range of interval scores between
   six scorers










The response ‘‘....I ignore it if courteously
done’’ is an illustration of the disagreement be- 
tween acceptance and rejection. Four of the scor- 
ers gave it a score of 2, indicating a rejection of 
the situation, while one gave it a score of 6, as- 
suming that the response showed ego strength or 
self-acceptance. One judged it to represent a 
rejection of others and scored it as 1.

In spite of the shadings of interpretation pos- 
sible in each response, the reliability of scoring 
seemed to be rather high. Each scorer indicated 
that he had some difficulty evaluating the response 
according to the scoring principles only. There 
was a frequent temptation to interpret the re- 
sponse dynamically for latent clinical content. 
Some also expressed the feeling that personal bi- 
as entered into their scoring. In spite of these 
factors, the inter-scorer reliability is sufficiently 
high to justify the scoring system. 

The second phase of establishing the reliabil- 
ity of the scoring system consisted of correlating 
the total scores of the investigator and two other 
scorers for ten randomly selected tests among the 
validation group, Schools II and III. After the investigator
had scored all of the 83 tests blind for
the teachers at Schools II and III, he submitted
every eighth test to two professors of educational 
psychology, both Fellows in the Division of Clin- 
ical and Abnormal Psychology of the American 
Psychological Association. They were given a 
copy of the scoring principles and list of scoring 
examples to assist them in scoring the ten tests. 
It will be noted from Table III that the rank order 
correlations between the investigator and the two 
scorers, A and B, indicate that the SCT can be 
scored reliably from scoring examples. 
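A rank order correlation between two scorers' total scores can be computed as in the sketch below. It uses the usual Spearman approach of ranking each set of totals (with tied values sharing the average rank) and correlating the ranks; whether the investigator handled ties this way is not stated, so that choice is an assumption, as are the function names. No values from Table III are reproduced here.

```python
# Illustrative sketch of a Spearman rank order correlation in plain Python.
def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Applied to the investigator's ten totals and those of scorer A or B, a value near 1 would indicate that the two scorers ordered the ten tests in nearly the same way.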

Phases I and II for establishing inter-scorer 
reliability would indicate that the investigator had 
established a consistent and understandable sys- 
tem of scoring the SCT. Phase I of the reliabil- 
ity procedure was considered to be a most signif- 
icant one. (1) The five scorers had available only 
the general scoring principles. The fact that the 
investigator and the five scorers were in agree- 
ment on the difference between acceptance and re- 
jection would show that the dimension of person- 
ality under investigation could be recognized. (2) 
Their ability to distinguish between different lev- 
els of acceptance and rejection as indicated by 
their general agreement on interval scores would 
seem to show that ego distance was a measurable 
quantity. (3) The six scorers for Phase I were 
not considered to be atypical of those now respon- 
sible for teacher trainee and employee selection. 
(4) The inter-scorer agreement would tend to 
give a high estimate of internal consistency for 
the SCT. 

Phase II of the reliability procedure would 
seem to indicate that the total score for the SCT 
could be used in the determination of a cut-off 
score for selection purposes. 







SECTION VII 


CRITERION MEASURES—FINDINGS 
AND CONCLUSIONS 


THIS SECTION will present the findings 
and conclusions revealed in the analysis of the 
criterion data used in this investigation. Section 
VIII will offer the findings from the predictive 
measures. 

The investigator was primarily concerned 
with the influence of a dimension of personality 
upon teacher effectiveness. It was, therefore, 
necessary to establish reliability of both criter- 
ion and predictive measures before any valid 
conclusions could be drawn from this study. 

The criterion measures used in this investi- 
gation served a primary purpose of correlating 
the SCT against a measure of teacher effective- 
ness. A secondary purpose of the criterion meas- 
ures was to study some of the factors involved in 
sampling procedures. 

This section will present the findings in three 
parts: (1) reliability of the criterion measures, 
(2) sampling differences between participating 
and non-participating teachers, and (3) miscel- 
laneous findings. 


I. Reliability of Criterion Measures 





The criterion measures consisted of the stud- 
ent, administrator, and self evaluations on three 
scales, A, B, and C. Scale A, teacher effec- 
tiveness, was considered to be the most useful 
measure for purposes of this investigation, as 
the investigator was primarily concerned with 
the relationship between the teacher’s attitude of 
acceptance and his effectiveness as a teacher. 

A global evaluation of effectiveness on an unstruc- 
tured and non-itemized scale was believed to be 
the best procedure to sample the raters’ feelings 
of the teachers’ effectiveness. 

The student ratings were the only ones for 
which reliability could be established, and only 
Scale A was used for this purpose. There was 
no reason to believe that the reliability for Scales 
B and C would be different. School II was used 
for convenience; and again, there was no reason 
to believe that the students at School III would 
differ significantly from those at Schools I and
II.

Reliability was determined as follows: Pairs 
of mean ratings on the 32 teachers at School I 
were obtained for two groups of randomly select- 
ed students. The reliability coefficient was .75. 
When this correlation was corrected by the Spear-
man-Brown formula for a typical class of thirty
students, a correlation of .88 was obtained, which
would indicate that student evaluations were a re-
liable criterion measure to use in this investiga-
tion.
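
The Spearman-Brown correction mentioned above can be sketched as follows. The formula itself is standard; the lengthening factor k is an assumption here, because the text does not state how many students were in each of the two random groups. The second function simply solves the same formula for the factor implied by the reported figures of .75 and .88.

```python
def spearman_brown(r, k):
    """Predicted reliability when the number of raters is multiplied by k."""
    return k * r / (1 + (k - 1) * r)


def lengthening_factor(r, r_target):
    """Factor k that would raise reliability r to r_target (inverse of the above)."""
    return r_target * (1 - r) / (r * (1 - r_target))


r_groups = 0.75                                 # correlation between the two groups of raters
doubled = spearman_brown(r_groups, 2)           # illustrative case: twice as many raters
k_implied = lengthening_factor(r_groups, 0.88)  # factor implied by the reported .88
```

Doubling the number of raters would raise .75 to about .86, so the reported .88 implies a somewhat larger factor, consistent with a class of thirty being more than twice the size of each group.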













Scales B and C were used to determine if fac- 
tors measured in those two scales had an influ- 
ence on teacher effectiveness as measured by 
Scale A. There was also the possibility that 
Scales B and C might correlate higher than Scale 
A with acceptance as measured by the SCT. Sec- 
tion IV raised the question of whether the three 
scales differed fundamentally in their purpose or 
rationale. Although it was not the purpose of this 
investigator to answer categorically the question 
of what constitutes the effective teacher, it was 
thought that Scales B and C might conceivably
include some elements which were contributing
to our understanding of the effective teacher.

The relationship between Scales A, B, and C 
for all participating teachers as evaluated by the 
students can be seen from the following product- 
moment correlations: 


Scales A and B     .66
Scales A and C     .72
Scales B and C     .76


The following conclusions may be drawn from 
these inter-correlations: 


1. All three scales are to some extent meas- 
uring the same or similar factor or factors. 

2. The effective teacher, according to the 
students, tends to be the one who accepts the stu- 
dents and trusts them (Scale B), and who also 
goes about his teaching with a minimum of strain 
and effort (Scale C). 

3. Some of the factors which the students used 
to evaluate their teachers’ effectiveness were 
also used in evaluating their teachers’ attitudes 
toward them and their teachers’ ease of teaching 
and sense of humor. 

4. A response-set established in Scale A 
caused the students to evaluate their teachers in- 
discriminately on Scales B and C. 


The reliability of the criterion scales was al-
so studied with reference to inter-rater consis-
tency at Schools II and III.

It can be seen from the product-moment cor- 
relations in Table IV that there was little consis- 
tency between the three classes of raters. Each 
group of raters was apparently evaluating the 
teachers according to a different standard. The 
students and administrators showed a closer re- 
lationship than did the teachers’ self evaluations 
with either students or administrators. In fact, 
the students and administrators at School III showed
a near significant reliability on Scales B and C.


These data would indicate that composite rat- 
ings were possible between students and admin- 
istrators, but that self evaluations were not prac- 
tical. The evidence from Table IV would further 
indicate that the investigator would be forced to 







accept the criterion evaluations of one group of 
raters only. Because the student ratings were 
found to be a reliable measure, and also because 
the scoring system for the SCT was standardized 
on the student evaluations of teacher effective- 
ness, it would seem that for purposes of this in- 
vestigation student ratings were the only reliable 
measure to be used. 


II. Validity of Predictive Measure





A preliminary evaluation of the validity of the 
SCT as a predictive measure of teacher effect- 
iveness may be helpful at this point in order to 
clarify the design used in this investigation. 


1. The literature has revealed a growing con- 
viction that behavior in specific situations may 
be predicted from projected verbal attitudes and 
feelings as expressed in response to semi-struc-
tured stimulus words and phrases.

2. The rationale for this study consisted of a 
belief that the effective person felt secure and 
unthreatened. The students at Schools I and II 
and the administrator at School III consistently 
rated the participating teachers higher than the 
non-participating teachers. The administrators 
at Schools I and II reversed the trend noted above. 
(The non-participating teachers at School III were
rated by the administrator only.)

This same tendency of the lower rated teach-
ers, as evaluated by the students and one admin-
istrator, to reject a threatening situation was al-
so noted in the tendency of the lower rated teach-
ers to reject the evaluations of themselves. A
significant number of the 33 who did not rate them- 
selves, but who did complete the SCT, were rat- 
ed below the mean by the students. It would, 
therefore, seem to indicate that the ineffective 
teacher from the students’ viewpoint is also the 
one who could not accept the threatening situa- 
tions involved in this study. It was also found 
that a significant difference existed between the 
mean scores on the SCT of the 71 teachers who 
evaluated themselves and the 33 who did not. 

3. The remainder of this study (Section VIII) 
is devoted to establishing validity for the SCT by 
product-moment correlations between the SCT 
and the various criterion measures. 


III. Sampling Differences Between Participating
and Non-Participating Teachers








The reliability of the sample used in this study 
can be approached from an analysis of the differ-
ences between the teachers who were evaluated
only and those who were evaluated and also par- 
ticipated in the standardization of the predictive 
measure. This investigator was eager to know 
if the criterion measures would reflect the influ- 
ence of the personality dimension under investi- 







TABLE III


INTER-SCORER RELIABILITY—PARTS I AND II OF SCT
(PHASE II)


Scorers                      Correlations


Part I
  Investigator and A
  Investigator and B
  A and B

Part II
  Investigator and A
  Investigator and B
  A and B





TABLE IV


INTER-RATER RELIABILITY—CRITERION MEASURES


Scales                        School I    School III


Scale A
  Students-Administrators
  Students-Self
  Administrators-Self

Scale B
  Students-Administrators
  Students-Self
  Administrators-Self

Scale C
  Students-Administrators
  Students-Self
  Administrators-Self
















gation, namely, acceptance. It was recognized 
that participation in this project constituted a 
challenge to the personality structure of the tea- 
cher. The design of the study provided evalua- 
tions on all classroom teachers at Schools I and 
II by students and administrators. At School III,
all classroom teachers were rated by the admin- 
istrator, and the students rated only the partici- 
pating teachers. If any significant differences 
were noted on the criterion scales, it could con- 
ceivably mean that the accepting teacher could 
also accept this threatening assignment. It would 
also mean that sampling procedures in any study 
of this type would have to recognize this possible 
source of error. 

It will be noted from Table V that the students 
at Schools I and II and the administrator at School
III consistently rated the participating teachers
higher than they did the non-participating teach- 
ers. There were, however, only two significant 
differences, Scales B and C at School II. Scale
C reflected the trend more than the other two 
scales. Apparently those teachers who go about
their teaching with the least effort were also most 
cooperative in this investigation. 

The administrators at Schools I and II reversed
the trend noted in the student evaluations at those
schools in that the administrators were almost as
consistent in evaluating the participating teachers 
lower. None of the differences was found to be 
significant, but the trend is evident. 

It would, therefore, seem evident that the ef- 
fective teacher had a tendency to be less threat- 
ened by the project used in this study and was 
more willing to accept the inconvenience and pos- 
sible exposure. It is also indicated that any in- 
vestigator should be cautious in assuming that 
voluntary participants are representative of the 
whole population. 

A comparison of the mean criterion scores 
for the 71 participating teachers who completed 
the self-evaluations and the 33 who participated 
but did not evaluate themselves shows the same 
trend in Table VI as noted in Tables V and VIII. 

The 71 participating teachers who completed 
the self evaluations were rated higher by the stud- 
ents on all three scales than the 33 who did not 
evaluate themselves. The administrators, how- 
ever, reversed the students’ trend by evaluating 
higher the 33 participating teachers who did not 
evaluate themselves. 

The students and administrators did not know 
at the time they evaluated the teachers which tea- 
chers were participating, but they were both able 
to distinguish the two groups of teachers. How- 
ever, the standards by which they rated the tea- 
chers apparently differed. 

If the accepting person is one who can accept 
a threatening situation such as a self evaluation, 
and the effective teacher is an accepting person, 
then obviously the students are better able to se- 







lect the accepting person and the effective teach- 
er than the administrators. 

The sex of the teachers was apparently not a
contributing factor to whether the teacher partic-
ipated or not. An equal number of male and fe-
male teachers volunteered at School III. At School
I, 50 percent of the male teachers and 40 percent
of the female teachers volunteered. At School II,
the female teachers volunteered in larger num-
bers than did the male teachers, 64.5 and 61 per-
cent respectively.


IV. Miscellaneous Findings 





An examination of the mean scores from Table 
VII indicates that the evaluations of the teachers 
were consistently above the middle score of 5 on
the 9-point scales. These above-average scores 
reveal a negative skewness as typically found in 
distributions of ratings. The distributions on 
Scale A were generally less skewed than those 
on Scales B and C. 

The consistency of the criterion measures be- 
tween the different raters and the different 
schools can be noted from Table VIII. 

Section A of Table VIII indicates that all ex- 
cept one of the critical ratios between the stud- 
ents’ and administrators’ ratings are significant 
at the 5 percent level or better. The students 
consistently rated the teachers higher than the 
administrators did. Only at School I on Scale A
was the difference not significant. 
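
A critical ratio of the kind reported in these tables can be sketched as follows. The text does not give the exact formula used, so this assumes the classical large-sample form (difference between two means divided by the standard error of the difference, with the two sets of ratings treated as independent). The means, standard deviations, and group sizes below are hypothetical.

```python
import math


def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two means divided by its standard error.

    Assumes independent groups; ratings of the same teachers by two
    classes of raters would strictly call for a correlated-means formula.
    """
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff


def is_significant(cr, level=0.05):
    """Two-tailed normal-curve thresholds for the .05 and .01 levels."""
    thresholds = {0.05: 1.96, 0.01: 2.58}
    return abs(cr) >= thresholds[level]


# Hypothetical figures: student vs. administrator ratings of 21 teachers.
cr = critical_ratio(6.5, 1.0, 21, 5.6, 1.2, 21)
```

A ratio of 1.96 or more corresponds to the 5 percent level and 2.58 or more to the 1 percent level, which is how the asterisks in the tables are read.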

An examination of Sections B and C of Table
VIII shows that Schools II and III differed signif-
icantly on all of the student measures and on two
of the three administrator measures. Insignifi-
cant differences are noted between Schools I and
II on the student measures, and on Scale B only
are the administrators’ measures significantly
different. Schools I and III show approximately
the same relationship.

Some explanation for this trend can be gained 
from Table IX. 

The students at School III rated their teachers 
higher on all three scales than did the students 
at either of the other two schools. There is al- 
so a consistency to be noted in the administrators’ 
ratings. The ratings at School II were highest,
with School III next and School I lowest.

It would therefore seem that the criterion 
measures used in this investigation were consis- 
tent within each school as compared with each 
other. This fact may indicate that the criterion 
samples have a certain point of reference which 
is peculiar at each school. One cannot interpret 
these findings to mean that the teachers are bet- 
ter at one school than they are at another. How- 
ever, it can be safely concluded that according 
to the rank order of mean scores the students
and administrators maintained their relative po- 
sitions on all three scales and thus demonstrated 







TABLE V


RELIABILITY OF THE DIFFERENCE BETWEEN CRITERION MEASURES FOR
PARTICIPATING AND NON-PARTICIPATING TEACHERS


Critical Ratios*


Schools             Scale A    Scale B    Scale C


School I
  Students          (+)** .79   (+) 2.62***
  Administrators    (-) 1.22    .00

School II
  Students
  Administrators

School III
  Administrators    (+) 1.91


*   The critical ratios in this table and subsequent ones were computed from the data con-
    tained in Table IX.

**  The (+) sign indicates that the participating teachers were evaluated higher than the
    non-participating teachers, as noted in Table IX. The (-) sign indicates that the par-
    ticipating teachers were evaluated lower.

*** Significant at .01 level.


TABLE VI


COMPARISON OF MEAN SCORES FOR TEACHERS WHO DID AND DID NOT
COMPLETE THE SELF EVALUATIONS


                               Teachers Evaluating    Teachers Not Evaluating
Mean Scores                    Themselves (N = 71)    Themselves (N = 33)


Students’ Mean Scores
  Scale A
  Scale B
  Scale C

Administrators’ Mean Scores
  Scale A
  Scale B
  Scale C








TABLE VII


RESULTS OF CRITERION MEASURES


Schools                            N    Scale A    Scale B    Scale C


School I

  Students’ Evaluations
    Participating Teachers
    Non-Participating Teachers
    Difference

  Administrators’ Evaluations
    Participating Teachers
    Non-Participating Teachers
    Difference

  Self Evaluations


School II

  Students’ Evaluations
    Participating Teachers
    Non-Participating Teachers
    Difference

  Administrators’ Evaluations
    Participating Teachers
    Non-Participating Teachers
    Difference

  Self Evaluations


School III

  Students’ Evaluations
    Participating Teachers

  Administrators’ Evaluations
    Participating Teachers
    Non-Participating Teachers
    Difference

  Self Evaluations













TABLE VIII


RELIABILITY OF THE DIFFERENCE BETWEEN THE CRITERION
MEASURES FOR PARTICIPATING TEACHERS


                                   Critical Ratios
                           Scale A    Scale B    Scale C


A. Students and Administrators
   School I      .2*  4.9*
   School II     3.3*
   School III    .1*  10.8*

B. Students’ Evaluations and
   Different Schools
   Schools I and II
   Schools II and III
   Schools I and III

C. Administrators’ Evaluations
   and Different Schools
   Schools I and II
   Schools II and III
   Schools I and III


*  Significant at .01 level.
** Significant at .05 level.


TABLE IX


RANK ORDER OF SCHOOLS ON EVALUATION SCALES


                  Scale A    Scale B    Scale C

Students

Administrators
















some consistent response set or point of refer- 
ence. 

A related aspect of the differences between
raters is the possible influence of grade level and 
sex of the students. The implication has been 
made in the literature that students are not ma- 
ture enough to evaluate their teachers realistic- 
ally, and further that the younger students are 
less able to rate. This investigator has noted 
that few studies have attempted to determine the 
differences, or sets, between grade levels. 

From Table X it can be seen that the teachers 
of all three grades at School II were rated high- 
est by the tenth graders on all three scales. With 
the exception of Scale A, they were also rated 
higher by the twelfth graders than they were by 
the eleventh graders. Inasmuch as the same 
teacher was evaluated by all three grades, one 
of the contributing variables would have to be the 
students’ grade level. It is possible that the 29 
teachers were more effective with twelfth grade 
students, but it is not probable that that would be 
the most significant variable in this instance. 

The only significant difference between the 
three grades was that between the tenth and 
twelfth graders on Scale A. There was a near
significant difference between the tenth and elev- 
enth graders. Had the twelfth graders rated the 
teachers lower than the eleventh graders, as they 
did on Scale A, there might have been an import- 
ant difference between Scales B and C. 

While there is a tendency for the tenth graders 
to rate higher, one cannot conclude that they are 
less right. Neither can one conclude that, be-
cause the ratings of the eleventh and twelfth grad-
ers were more like those of the administrators,
the older students were more right. 

The reliability of the differences between male 
and female students in their evaluations of their 
teachers showed no significant differences on 
Scales A and C, as noted in Table XI. However, 
on Scale B the female students evaluated their 
teachers much higher than did the males—criti- 
cal ratio, 3.41. Apparently the female students 
feel that their teachers respect and trust them 
more than do the male students. 

The investigator endeavored to determine the 
feasibility of using self evaluations as criterion 
measures. Any conclusions to be drawn from 
the findings will necessarily be weakened by the 
fact that only 71 of the 104 participating teachers 
completed the criterion measures. 

Table XII reveals that the students generally 
rated the teachers higher and the administrators 
rated them lower than the teachers rated them- 
selves. The teachers at School III rated them- 
selves higher on Scales B and C than did the tea- 
chers at Schools I and II, but they also rated 
themselves lower on Scale A. The only signifi- 
cant difference between the student and the teach- 
er ratings was on Scale A at School III, where the 






students rated the teachers higher and the teach-
ers rated themselves lower than at either School
I or School II. There were significant differences
on Scales B and C between the administrator and 
self evaluations at all three schools, where the 
teachers consistently rated themselves higher. 


V. Summary 


1. Student ratings of teacher effectiveness 
were found to be reliable, with a correlation of 
.88 based on a class of 30 raters. 

2. Inter-scale correlations of student ratings 
showed that the effective teacher tended also to 
be an accepting teacher (correlation between 
Scales A and B, .655). The effective teacher 
was also one who taught with ease and with a 
sense of humor (correlation between Scales A 
and C, .72). The teacher felt to be the most ac- 
cepting by the students was also the one who taught 
with ease (correlation between Scales B and C,
.76). 

3. No relationship was found to exist between 
the ratings of the students, the administrators, 
and the teachers’ ratings of themselves (Table 
IV). 

4. A positive tendency was noted on all three 
scales for the students to evaluate the participat- 
ing teachers higher than those teachers who did 
not participate in the standardization of the SCT 
(Table V). The administrator ratings at School
III showed the same trend, while the administra-
tors at Schools I and II showed a tendency to rate
the non-participating teachers higher on all three
scales.

5. No sex difference was noted on the tendency 
of the teachers to participate in the standardiza- 
tion phase of this study. 

6. All distributions of evaluation scores show- 
ed a negative skewness with distributions on Scale 
A generally showing less skewness (Table VII). 

7. A consistent tendency was noted for the 
students to evaluate the teachers significantly 
higher than did the administrators. The teach-
ers also rated themselves higher than did the ad-
ministrators, significantly so on Scales B and
C (Table VII).

8. The rank order of mean scores by schools 
for both students and administrators was main- 
tained on all three scales (Table IX). 

9. The tenth grade students showed a tendency 
to rate their teachers higher than did the other 
two grade levels, but only one significant differ-
ence was noted, on Scale A (Table X).

10. The male students showed no discrimin-
ating difference from the female students on Scales
A and C. However, on Scale B, the male students
felt less accepted and trusted than did the female
students (Table XI).

11. It was noted that the participating teach- 
ers were somewhat frustrated and anxious about 







TABLE X 


STUDENT EVALUATIONS OF TEACHERS OF ALL THREE GRADES 
SCHOOL II * 





Grades Scale A Scale B Scale C 





Mean Scores 


10th Grade 
11th Grade 
12th Grade 


Critical Ratios 


Difference between: 
10th and 11th Graders 
11th and 12th Graders 
10th and 12th Graders 





*  N = 29 teachers.
** Significant at .05 level.


TABLE XI 


MALE AND FEMALE STUDENTS’ EVALUATIONS OF TEACHERS 
SCHOOL II* 





Scale A Scale B Scale C 





Males—Mean Score        6.3       6.9       7.0
Females—Mean Score      6.4       7.4       7.0
Difference

Critical Ratio                    3.41**


*  N = 61 teachers.
** Significant at .01 level.







completing the evaluations of themselves. The 
fact that 33 of the 104 teachers did not rate them- 
selves on all three scales might be interpreted 

to mean that they could not accept themselves or 
the situation. Section VIII will show that a sig-
nificant percentage of the 33 teachers were found
to be less accepting, according to their scores on 
the SCT. 

12. The self evaluations in this study were 
not considered to be reliable measures because 
of the large number of teachers who declined to 
rate themselves.


SECTION VIII


PREDICTIVE MEASURES—FINDINGS 
AND CONCLUSIONS 


THIS SECTION will present the relation-
ship between the scores on the SCT and the var-
ious criterion measures. The first part will
present the product-moment correlations for the
standardization group, School I. The second
part will present the correlations between the
criterion measures and the SCT scores after the
investigator had scored the tests "blind." The
third part will present the correlations of the self
evaluations with the SCT scores. The fourth part
will present the correlations between the SCT
scores and certain biographical data.

The purpose of the SCT was to measure the 
subject’s attitude of acceptance, a dimension of 
the personality organization. Section VI described 
the construction, scoring principles, and the re-
liability of the predictive measure. The SCT was
divided into two parts, Part I and Part II, on the
basis of scoring technique. It was considered 
that the parts were similar in principle and that 
both were measuring the same aspect of the per- 
sonality organization. 

Part I consisted of 27 incomplete sentences 
which remained from the original 91 items after 
31 had been eliminated as difficult to score, and 
31 had been eliminated by inspection. Part II of 
the SCT contained 13 items from the 31 that had 
been eliminated as difficult to score on the basis 
of the 7-point scale used in scoring Part I. The
items in Part II were scored as positive or neg- 
ative responses, and the positive responses were 
given a score of 3, equivalent to the mid-point of 
the scoring scale used in Part I. 

After the investigator had standardized the 
scoring system of the SCT on the participating 
teachers at School I, he scored the tests "blind"
for the teachers at Schools II and III.

The product-moment correlations between the 
40 SCT items and the students’ Scale A at all 
three schools were slightly improved by submit- 
ting each test item to the chi-square test of sig- 
nificance. Fourteen of the 27 items in Part I 







and 12 of the 13 items in Part II met the .20 level 
of significance or better when the SCT scores of 
the ten highest-rated teachers of the total 104 were 
compared with those of the ten lowest-rated teach- 
ers. The student evaluations on Scale A, teacher 
effectiveness, were used to select the top- and 
bottom-rated teachers. Parts I and II of the SCT
were combined to form Part III. All of the cor-
relations cited in the rest of this section will be
based upon the SCT scores obtained after the re-
finement of Parts I and II as described above.
Part I of the SCT will consist of the scores ob-
tained on the fourteen items scored from 0 to 6;
Part II will be the scores obtained by multiplying
the number of positive responses by 3; and Part
III will be a combination of the scores for Parts
I and II.
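
The scoring arithmetic described above can be sketched as follows. This is an illustrative reading of the text and not the investigator's scoring sheet: it assumes 14 retained items in the first part (each scored 0 to 6), 12 retained items in the second part (each positive or negative), and a combined score equal to the sum of the two. The function and field names are hypothetical.

```python
def score_sct(part1_item_scores, part2_positive_flags):
    """Compute the three SCT part scores from one teacher's item-level results.

    part1_item_scores: 14 integers, each from 0 to 6.
    part2_positive_flags: 12 booleans, True for a positive response.
    """
    if len(part1_item_scores) != 14:
        raise ValueError("the first part is assumed to have 14 items")
    if any(not 0 <= s <= 6 for s in part1_item_scores):
        raise ValueError("first-part items are scored from 0 to 6")
    if len(part2_positive_flags) != 12:
        raise ValueError("the second part is assumed to have 12 items")

    part1 = sum(part1_item_scores)
    # Each positive response counts 3, the mid-point of the 0-6 scale.
    part2 = 3 * sum(1 for flag in part2_positive_flags if flag)
    return {"part1": part1, "part2": part2, "combined": part1 + part2}


# Hypothetical teacher: every first-part item at the mid-point, six positives.
example = score_sct([3] * 14, [True] * 6 + [False] * 6)
```

Under these assumptions the combined score can range from 0 to 120 (84 from the first part and 36 from the second).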


I. Standardization Group—School I 





The normative data were obtained from the 21
participating teachers at School I. There was no
reason to believe that this sample was in any way
atypical of the larger sample used in this study.
The scoring system for Part I was in large part
devised independently of the ratings, and so it can
be said that the scoring of the items was not biased
unduly by the student ratings on Scale A. Part II
was scored quite deliberately on the basis of the
student ratings for Scale A and therefore it can
be said there was some bias in favor of student
judgment.

Table XIII presents the correlation coefficients
for the normative group at School I. All of the
student evaluations on Scales A and B correlated
significantly with scores on Parts I, II, and III.
Near significant correlation coefficients were ob-
tained on Scale C. Only one of the administrator
evaluations was significantly correlated with SCT
scores, namely, Part I of the SCT and Scale A.

It will be noted that the student evaluations 
correlated better with Part I of the SCT than they 
did with Part II. This was also true for the ad-
ministrator evaluations on Scale A. However,
Part II improved the correlations on Scales B 
and C slightly. 

It can be seen that a definite relationship ex- 
ists between the teacher's attitude of acceptance 
as measured by the SCT and the teacher’s effec- 
tiveness as measured on Scale A by the students. 
It can be concluded that a tendency exists for the 
effective teacher, according to pupil judgment, 
to be an accepting person. 

The significant correlation coefficients for 
Scale B and the SCT scores would also indicate 
that the students feel that the accepting person is 
also the teacher who seems to trust them, accept 
them, and have confidence in them. Apparently 
the well-integrated teacher as evidenced by an 
attitude of acceptance demonstrates this attitude 
in a manner which is felt or perceived by the stud- 







TABLE XII


AN ANALYSIS OF THE DIFFERENCES BETWEEN SELF EVALUATIONS
AND OTHER CRITERION MEASURES


School    N                      Mean A    Mean B    Mean C


I         21  Student            6.5
          15  Self               6.2
              CR
          15  Self
          21  Administrator
              CR

II        51  Student
          28  Self
              CR
          28  Self
          21  Administrator
              CR

III       32  Student
          28  Self
              CR
          28  Self
              Administrator
              CR


*  Significant at .01 level.
** Significant at .05 level.

TABLE XIII


PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN SCT SCORES 
AND CRITERION MEASURES—SCHOOL I* 





Evaluations Scale A Scale B Scale C 





Part I (14 Items) 


Student Evaluations 
Administrator Evaluations 


Part II (12 items) 


Student Evaluations 
Administrator Evaluations 


Part III (26 items) 


Student Evaluations 
Administrator Evaluations 





*  N = 21 teachers.
** Significant at .05 level or better.







ents. 

The correlation coefficients between student 
judgment on Scale C and SCT scores do not indi- 
cate that the accepting person is also the teach- 
er who conducts his class in an easy manner. 
However, there is a definite tendency at School 
I for this to be true, as indicated by the near sig- 
nificant relationship. 

The correlation coefficients for the adminis- 
trators reflect the same tendency as noted above 
in the analysis of the student evaluations. How- 
ever, the positive relationship between the tend- 
ency for the accepting person to be an effective 
teacher was not so marked in the case of the ad- 
ministrators’ ratings as in those of the students. 


II. Validation Group—Schools II and III





The SCT was validated on groups of teachers 
which did not include any of the subjects used in 
developing the scoring principles and scoring ex- 
amples. Scoring of the tests was done ‘‘blind’’ 
in that the investigator never knew whether the 
test blank was supposed to be that of an effective 
or ineffective teacher. It was believed that the 
sample of teachers at School II was similar to 
that at School I. There were enough differences 
in sampling procedure at School III to raise some 
doubt about the reliability of the two validation 
groups and so the correlation coefficients were 
computed separately for Parts I and III of the 
SCT. 

Table XIV indicates that significant correla-
tion coefficients were obtained on both sample
groups for teacher effectiveness, Scale A, as
evaluated by the students. The correlation coef-
ficients of .454 and .596 obtained on Parts I and
III for Scale A at School II are insignificantly dif-
ferent from those of .625 and .722 at School III,
due to the difference in number of teachers in
each sample.
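
One way to check whether two correlations from separate samples differ significantly is Fisher's z transformation, sketched below. The text does not say which test the investigator applied, so this is an assumed method. The sample sizes of 51 and 32 teachers are also assumptions, taken from the student rows of Table XII and from the 83 validation tests mentioned earlier.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)      # Fisher's r-to-z transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z1 - z2
    return (z1 - z2) / se


# Part I correlations with Scale A: .454 (assumed N = 51) vs. .625 (assumed N = 32).
z_part1 = fisher_z_difference(0.454, 51, 0.625, 32)
not_significant = abs(z_part1) < 1.96            # two-tailed .05 threshold
```

Under these assumptions the deviate is close to -1.0, well inside the 1.96 threshold, which agrees with the statement that the two schools' coefficients are not significantly different.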

Significant correlations on Scale B, with the
exception of a near significant correlation for
Part I at School II, would confirm the trend es-
tablished at School I. The correlation coeffic-
ients were practically as high for Schools II and
III as they were for School I.

All the student correlation coefficients on Scale
C were consistently better than would be expected
from chance. A comparison of the results on
Scale C from Tables XIII and XIV shows that
Schools II and III reversed the trend at School I,
the normative group. None of the correlations
at School I was significant, while all were signif-
icant at Schools II and III. However, the differ-
ences were negligible considering the number in
each sample.

It will also be noted from Table XIV that the
administrators at Schools II and III were consistent
with those at School I in that they also demonstrated
positive correlations but generally insignificant
ones.

In another test for validity of the SCT it was
found that the SCT could successfully distinguish the
effective teacher from the ineffective teacher.
The 104 participating teachers were divided into
two groups on the basis of the average mean rating
of 6.44 for the students’ evaluation on Scale
A. At School I it was found that a cutting score
of 63 on Part III of the SCT correctly identified
64 percent of the effective teachers and 86 percent
of the ineffective teachers. At School II the
cutting score identified 69 percent of the effective
and 80 percent of the ineffective teachers;
and at School III, 100 percent of the effective
teachers and 69 percent of the ineffective teachers
were identified.


After combining the three schools, it was 
found that the cutting score of 63 identified the 
effective and ineffective teachers equally well, 
78 percent. At Schools II and III, the validation
group, the cutting score correctly identified 82 
percent of the effective teachers and 77 percent 
of the ineffective teachers. 
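
The cutting-score procedure described above can be sketched as follows. The teacher scores and ratings in the example are hypothetical, since the raw data are not reproduced in this report; only the cutting score of 63 and the mean rating of 6.44 come from the text. Treating a score or rating exactly at the cut as "effective" is also an assumption.

```python
def cutting_score_hit_rates(sct_scores, student_ratings, cut=63, rating_mean=6.44):
    """Return (share of effective teachers scoring at or above the cut,
    share of ineffective teachers scoring below the cut)."""
    effective = [s for s, r in zip(sct_scores, student_ratings) if r >= rating_mean]
    ineffective = [s for s, r in zip(sct_scores, student_ratings) if r < rating_mean]
    effective_hits = sum(1 for s in effective if s >= cut)
    ineffective_hits = sum(1 for s in ineffective if s < cut)
    return effective_hits / len(effective), ineffective_hits / len(ineffective)


# Hypothetical SCT Part III scores and Scale A student ratings for six teachers.
example_scores = [70, 65, 60, 58, 66, 61]
example_ratings = [7.0, 6.8, 6.5, 5.9, 6.0, 6.1]

effective_rate, ineffective_rate = cutting_score_hit_rates(example_scores, example_ratings)
print(f"effective identified:   {effective_rate:.0%}")
print(f"ineffective identified: {ineffective_rate:.0%}")
```

Reporting the two rates separately, as the text does, shows whether a cut favors one kind of error over the other.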


If the SCT were used as a screening device, 
it would, therefore, be able to serve its purposes
adequately. This validity is more significant 
when it is remembered that the screening of norm- 
als is considered to be more difficult than differ- 
entiating between normal and abnormal personal- 
ities. 


According to the rationale of this investiga- 
tion, the threatened or insecure person has a 
tendency to reject any stimulus which is not readily
incorporated into his frame of reference or
phenomenal field. The evaluations of the self 
on the three criterion scales were considered to 
be threatening. Other investigators have noted 
this phenomenon when the subject is asked to 
identify or to describe himself in relation to 
some situation. 


It was noted in Section VII that the teachers
who rejected the self evaluations were rated lower
by the students and one administrator than
were the teachers who accepted the self evaluations.
The SCT was also able to identify the teachers
who rejected the self evaluations. The difference
in mean scores of 4.72 on the SCT between
the 71 teachers who accepted the self evaluations
and the 33 who rejected the self evaluations
was significant at the .01 level with a critical
ratio of 2.64. The fact that a difference that
large could not be expected by chance one time
out of a hundred would indicate that a significant
variable was operating. It may be safely assumed
that in this instance the attitude of rejection
as measured by the SCT was the variable
responsible for a rejection of the self evaluations.
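
The step from a critical ratio of 2.64 to the .01 level can be illustrated with a small sketch. It assumes the critical ratio is referred to the normal curve, two-tailed, which is the usual reading of the term but is not stated in the text. The standard error shown is only what the two reported figures imply; it is not a value given by the investigator.

```python
import math


def two_tailed_p_from_critical_ratio(critical_ratio):
    """Two-tailed normal-curve probability for a critical ratio
    (difference divided by its standard error)."""
    return math.erfc(abs(critical_ratio) / math.sqrt(2.0))


mean_difference = 4.72   # difference in mean SCT scores, from the text
critical_ratio = 2.64    # from the text

implied_standard_error = mean_difference / critical_ratio
p_value = two_tailed_p_from_critical_ratio(critical_ratio)

print(f"implied standard error of the difference: {implied_standard_error:.2f}")
print(f"two-tailed probability: {p_value:.4f}")
```

On this reading the probability is a little under .01, consistent with the level reported.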







TABLE XIV

PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN SCT SCORES AND CRITERION
MEASURES—SCHOOLS II AND III*

                                Scale A         Scale B         Scale C
                                Schools         Schools         Schools
Evaluations                    II     III      II     III      II     III

Part I
  Student Evaluations         .45**  .63**    .23
  Administrator Evaluations   .07    .47**    .23

Part II
(Schools II and III, combined)
  Student Evaluations
  Administrator Evaluations

Part III
  Student Evaluations         .60**
  Administrator Evaluations   .17

 * School II, N = 51; School III, N = 32 teachers.
**Significant at .05 level or better.


TABLE XV

PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN SCT
SCORES AND SELF EVALUATIONS—SCHOOLS I, II, III

Parts        N     Scale A    Scale B    Scale C

Part I       71    .18        .16        .12
Part III     71    .25*       .20        .16

*Significant at .05 level.







III. Self Evaluations of Participating Teachers
for Schools I, II and III








Table XV indicates that a significant correlation
coefficient was obtained between the teachers’
attitude of acceptance as measured by Part
III of the SCT and their own evaluations of their
teaching effectiveness, Scale A. This observa- 
tion would mean that there was a tendency for 
those teachers who rated themselves high to al- 
so demonstrate a greater degree of acceptance. 

The results from Table XV must be interpreted
cautiously, as 33 of the 104 participating teachers
did not complete the evaluations of themselves.
The significance of this factor can be
partially understood by determining which teachers
declined to evaluate themselves. It was noted
that 23 percent of the teachers, or 13 out of 57,
who had the highest scores on Part I of the SCT
declined while 42.5 percent, or 20 out of 47, of
those who had the lowest scores declined to evaluate
themselves. This difference of 19.5 percent
would not have occurred by chance one time out
of twenty. If this trend were projected on the
self evaluations, it could conceivably have improved
the correlation coefficient on Scale A.
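
The claim that the difference in refusal rates would not occur by chance one time in twenty can be checked with a two-proportion test. The pooled-variance z test below is an assumed method; the text reports only the counts (13 of 57 and 20 of 47) and the conclusion.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled-variance z statistic for the difference between two proportions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_b - p_a) / standard_error


# Teachers declining the self evaluation: 13 of 57 high scorers, 20 of 47 low scorers.
z_decline = two_proportion_z(13, 57, 20, 47)
p_decline = math.erfc(abs(z_decline) / math.sqrt(2.0))  # two-tailed normal probability

print(f"z = {z_decline:.2f}, two-tailed p = {p_decline:.3f}")
```

Under these assumptions z is a little above 2.1 and the two-tailed probability is near .03, which supports the .05 level claimed.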

The fact that a significantly greater number 
of those teachers who demonstrated a rejecting 
attitude on the SCT also rejected this aspect of 
the study would verify the rationale of this inves- 
tigation. It will be recalled that one of the cri- 
terion findings noted in Section VII was the anx- 
iety expressed by the participating teachers when 
they were asked to evaluate themselves. 

The following conclusions can be drawn from 
the analysis of Table XVI and the findings as described
in subdivisions I, II, and III:


1. Even though the validation procedure used 
in this study was a crude one, the consistency of 
results obtained at Schools II and III where the
SCT tests were scored “blind” would indicate
that the SCT test is able to discriminate between 
the effective and ineffective teachers. 

2. The relationship between the teacher’s 
measured attitude of acceptance and his teaching 
effectiveness according to the students’ judgment 
correlated higher than would be expected by 
chance. 

3. The mean score of 63 on the SCT, when 
used as a cutting score, was found to identify 
over 75 percent of the effective and ineffective 
teachers. 

4. The teacher’s attitude of acceptance cor- 
related significantly with the students’ judgment 
of the teacher’s attitude of confidence and respect 
for his students. Likewise, there is a significant 
tendency for the accepting teacher to be relaxed
in his teaching, according to the opinion of the 
students. 







5. The administrators’ judgment of the teachers’
effectiveness, attitude of confidence and respect
for their students, and the ease with which
the teachers went about their job of teaching
correlated positively with the teachers’ attitude of
acceptance, but generally insignificantly.

6. The correlation coefficients at the three 
schools were sufficiently consistent to indicate: 

a. The scoring system established for the 
SCT is a reliable technique. 

b. The three samples of secondary school 
teachers are not unlike each other and 
are probably representative of the total 
population of active secondary school 
teachers on the personality dimension 
of acceptance. 

7. Those teachers who were the most accept- 
ing also evaluated themselves higher than the 
teachers who demonstrated a rejecting attitude. 
The accepting teachers are better able to accept 
a threatening situation, such as evaluating them- 
selves, than are those less accepting teachers. 

8. Scale A was consistently the one criterion 
measure that correlated best with the measured 
attitude of acceptance. Apparently the teacher’s 
effectiveness is influenced more by an acceptance 
attitude than is the teacher’s observed attitude of 
trust and confidence in the students, or the ease 
with which he goes about his teaching. 

9. It was indicated in Section VII that the subject
of the stimulus phrase was important in the
construction of sentence completion items. Of
the original 91 items used in the SCT, 51 or 56
percent used the first person, 30 percent were
in third person masculine or feminine gender,
and 14 percent were neuter gender or impersonal
situations. The final 26 items in Part II of
the SCT did not show the same proportion; 65 percent
of the first person items, 27 percent of the
third person, and 8 percent of the situational items
survived the elimination process. The increase
in the number of first person items in the
final form would indicate that when the item is
structured to elicit a response about the subject,
it is more discriminating. This observation
would tend to reinforce the thesis of this study.


IV. Findings on Biographical Data 





This subdivision will attempt to break down 
some of the criterion and predictive data and an- 
alyze it with reference to sex, marital status, 
dependents, subjects taught, years of teaching 
experience, and age. No attempt was made to 
discriminate between the teachers at the differ- 
ent schools. Separate Pearson product-moment 
correlation coefficients were computed for each 
group to determine the relationship between the 
teacher’s attitude of acceptance and his teaching 
effectiveness. 







TABLE XVI

SUMMARY OF CORRELATION COEFFICIENTS BETWEEN SCT SCORES
AND CRITERION MEASURES

Evaluations                       Scale A    Scale B    Scale C

Part I
  Student Evaluations
    School I
    School II
    School III
  Administrator Evaluations
    School I
    School II
    School III
  Self Evaluations

Part II
  Student Evaluations
    School I
    Schools II and III
  Administrator Evaluations
    School I
    Schools II and III

Part III
  Student Evaluations
    School I
    School II
    School III
  Administrator Evaluations
    School I
    School II
    School III
  Self Evaluations

 * Significant at .01 level.
**Significant at .05 level.







1. Sex Differences 


The correlation coefficients between the scores 
on the SCT and the student evaluations of their 
effectiveness show no significant differences be- 
tween the male and female teachers. This would 
be expected considering the insignificant difference
between their mean scores on the SCT and
their evaluations, provided the students’ Scale 
A were a reliable measure of effectiveness (Sec- 
tion IV showed a reliability coefficient of .88) and 
further that the SCT were a reliable predictive 
measure (the investigator demonstrated the test’s 
reliability in Section VI). Therefore, it may be
concluded that the coefficients of correlation noted 
in Table XVI are significant ones. 

It will be noted that the administrators evalu- 
ated the female teachers higher than the male 
teachers. The fact that the administrator eval- 
uations of the male teachers correlated higher 
with the SCT than did the evaluations of the fe- 
male teachers may be explained by the difference 
in rating, or the unreliability of the administra- 
tors’ evaluations, or both. It was noted from an 
examination of the correlation charts that a suf- 
ficient number of the teachers whom the admin- 
istrators had scored as effective teachers scored 
low on the SCT so that the correlation coefficient 
of .059 for the female teachers showed no rela- 
tionship. This trend was also carried over to 
Part III on the SCT. 

No significant differences between the male 
and female teachers are to be noted on the self 
evaluations and the SCT scores. The female 
teachers’ correlations were higher than the ad- 
ministrators’. The reverse was true of the male 
teachers. This trend can possibly be accounted 
for by the fact that the administrator and self 
evaluations for the male and female teachers were 
reversed. 

The same trend noted in subdivision II regard- 
ing the tendency of the rejecting teacher on the 
SCT to reject the self evaluations was apparent 
in this analysis of sex differences. A higher 
percentage of both males and females who scored 
toward the rejection end of the SCT scale also 
rejected the self evaluations. The difference 
among the males was not significant; but the dif- 
ference in percent, 28.3, for the females was 
significant at the .05 level. Apparently the self 
evaluation project was more threatening to the 
female teachers. 

It can be concluded that both sexes are equally 
accepting persons as measured on the SCT. The 
SCT scores correlated much better with the stud- 
ent evaluations of teacher effectiveness than they 
did with the administrator or self evaluations. 
The correlations with administrator ratings were
significantly higher for the male teachers, and
this trend is probably due to a tendency for the
administrator evaluations to be less reliable for
the female teachers. The same factor of unreliability
is evident in the self evaluations, plus
the influence of the 33 teachers who did not evaluate
themselves.


2. Marital Status 


The student evaluations of the participating 
teachers show no significant differences between 
married and single teachers in the correlations 
with the SCT scores on acceptance, as noted in 
Table XVIII. The administrator and self evaluations,
however, show significantly higher correlations
with the SCT for the married teachers.
Apparently the administrators were better able 
to evaluate the married teacher’s effectiveness 
as it relates to acceptance than they were to eval- 
uate the single teacher’s effectiveness. In fact, 
the single teachers’ effectiveness shows no rela- 
tionship to acceptance, according to the adminis- 
trators and the teachers themselves. 

If the students’ evaluations show significant 
correlations with the teachers’ measured attitude 
of acceptance for both groups of teachers, and 
the administrator and self evaluations for mar- 
ried teachers are significant or near significant, 
it can mean only that the evaluations of the single 
teachers are unreliable or differ in some signif- 
icant way. The evidence on self evaluations may 
also indicate that the single teachers are less in- 
sightful concerning the problem under question 
in this investigation. It is also noted in Table 
XVIII that the single teachers evaluated themselves
significantly lower than did the married
teachers. 

The possible unreality of the single teachers 
may be explained by assuming that the tendency 
of the single teachers to reject the self evalua- 
tion phase of the project was present also in those 
who accepted the self evaluations. None of the 
15 single teachers who scored toward the accept- 
ance side of the SCT failed to complete the self 
evaluations. However, 9 of the 16 teachers on 
the rejecting side of the SCT also rejected the 
self evaluations. Obviously this difference in 
percent between 0 and 56 would be a phenomenon 
that could not happen by chance short of infinity. 
Of the single teachers, 66 2/3 percent were fe- 
males; of the married teachers, only 44 percent. 
The fact that the female teachers were more re- 
jecting of the self evaluations may be responsible 
for the fact that the single teachers were more 
rejecting of the self evaluations. 
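
The contrast between 0 of 15 and 9 of 16 single teachers can be given an exact probability. The sketch below uses Fisher's exact test, which is an assumption on the editor's part; the text does not name a test. It relies only on the counts reported above.

```python
from math import comb


def hypergeom_pmf(k, group_n, other_n, total_k):
    """Probability that exactly k of the total_k refusals fall in the first group,
    given fixed group sizes (hypergeometric distribution)."""
    return comb(group_n, k) * comb(other_n, total_k - k) / comb(group_n + other_n, total_k)


def fisher_exact(k, group_n, other_n, total_k):
    """Return (probability of the observed table, two-sided p value).
    The two-sided value sums every table no more probable than the observed one."""
    observed = hypergeom_pmf(k, group_n, other_n, total_k)
    lowest = max(0, total_k - other_n)
    highest = min(group_n, total_k)
    two_sided = 0.0
    for j in range(lowest, highest + 1):
        p_j = hypergeom_pmf(j, group_n, other_n, total_k)
        if p_j <= observed * (1.0 + 1e-9):
            two_sided += p_j
    return observed, two_sided


# Single teachers: 0 of 15 "accepting" and 9 of 16 "rejecting" declined the self evaluation.
p_observed, p_two_sided = fisher_exact(0, 15, 16, 9)
print(f"probability of observed table: {p_observed:.6f}")
print(f"two-sided exact p value:       {p_two_sided:.6f}")
```

On these counts the exact probability is well below .001, so the difference is highly significant, although the chance probability is small rather than zero.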

Not only did the students and the teachers 
themselves rate the single teachers lower, but 
the single teachers scored significantly lower on 
the SCT. The difference in mean scores of 4.27 
was significant at the .01 level, indicating that 
the single teachers were more rejecting. The 







TABLE XVII

SEX DIFFERENCES

Pearson Product-Moment
Correlations

Part I of SCT and
  Student Evaluations
  Administrator Evaluations
  Self Evaluations

Part III of SCT and
  Student Evaluations
  Administrator Evaluations
  Self Evaluations

Mean Scores
  SCT: Part I
       Part III
  Evaluations: Student
               Administrator
               Self

*Significant at .05 level or better.


TABLE XVIII

MARITAL STATUS

Pearson Product-Moment            Scale A
Correlations                      Married    N

Part I of SCT and
  Student Evaluations             .54*       31
  Administrator Evaluations       .37*       31
  Self Evaluations                .18        22

Part III of SCT and
  Student Evaluations             .65*
  Administrator Evaluations       .334
  Self Evaluations                .34*

Mean Scores
  SCT: Part I
       Part III
  Evaluations: Student
               Administrator
               Self

*Significant at .05 level or better.







TABLE XIX

NUMBER OF DEPENDENTS

                                          Scale A
Pearson Product-Moment         Less than 2         2 or More
Correlations                   Dependents    N     Dependents

Part I of SCT and
  Student Evaluations          .64*          40    .41*
  Administrator Evaluations    .11           40    .44*
  Self Evaluations             .18           30    .01

Part III of SCT and
  Student Evaluations          .60*          40
  Administrator Evaluations    .20           40
  Self Evaluations             .31           30

Mean Scores
  SCT: Part I                                40
       Part III                              40
  Evaluations: Student                       40
               Administrator                 40
               Self                          30

*Significant at .05 level or better.


TABLE XX

SUBJECTS TAUGHT

                                         Scale A
Pearson Product-Moment
Correlations                   Academic    Non-Academic

Part I of SCT and
  Student Evaluations          .58*
  Administrator Evaluations    .24
  Self Evaluations             .27

Part III of SCT and
  Student Evaluations          .63*
  Administrator Evaluations    .28*
  Self Evaluations             .40*

Mean Scores
  SCT: Part I
       Part III
  Evaluations: Student
               Administrator
               Self

*Significant at .05 level or better.







fact that the administrators rated the single teachers
higher would probably account for the negative
correlation of -.120 between Part I of the
SCT and the administrators’ evaluations of teaching
effectiveness.


3. Number of Dependents 


Some of the students in the field of personnel 
problems have called attention to the implication 
that persons with dependents and financial obliga- 
tions are more stable and hence better employ- 
ment risks. Table XIX reveals certain data on 
the teacher sample used in this study. 

An examination of the correlation coefficients 
in Table XIX would indicate that those teachers 
with fewer dependents show a better relationship 
between their classroom effectiveness, accord- 
ing to the students, and their attitude of accept- 
ance than do those with two or more dependents. 
The administrator ratings would indicate the opposite.
Inasmuch as the mean scores on the SCT
show no significant difference between the two 
groups, the other variable, evaluation of effect- 
iveness, is likely to be responsible. The differ- 
ence in mean scores, .11, was not significant, 
and therefore the administrators’ basis of dis- 
crimination between the two groups was apparent- 
ly different from that of the students. 

It cannot be concluded that either group is more 
accepting than the other, as evidenced by the 
mean scores on the SCT, nor is one group signif- 
icantly more effective. However, the students 
and the teachers themselves show a higher cor- 
relation for those teachers with fewer dependents, 
while the administrators provide significant correlations
for those with more than two dependents.


A larger percentage of those with fewer de- 
pendents rejected the self evaluations than did 
those with more dependents, 35.5 percent and 
29.5 percent respectively. Of those teachers 
with fewer than two dependents, a significantly 
greater percentage of the rejecting teachers on 
the SCT rejected the evaluations than did the accepting
teachers. The teachers with two or more
dependents showed only a chance difference in 
percentage of rejection of the self evaluations. 


4. Subjects Taught 


An inspection of the differences between aca- 
demic and non-academic instructors from Table 
XX reveals no significant differences between 
the two groups on evaluations. However, on the
SCT the academic teachers scored significantly
higher than the non-academic, where the difference
between the mean scores on Part III of 1.54
was significant at the .01 level. This would 
clearly indicate that the academic instructors 
were more accepting than the non-academic instructors.
This observation on evaluations and
acceptance is contrary to a general opinion that
the students have a tendency to think better of 
the non-academic teachers. 

The correlations between the two measures 
favor the non-academic teachers slightly. The 
self evaluations, however, definitely show a higher
correlation for the academic teachers with a
significant correlation on Part III of the SCT and 
a near significant correlation for Part I. 

No difference was noted between the “accepting”
and “rejecting” non-academic instructors
with reference to the tendency to reject the self
evaluations. A near significant difference was
noted for the academic instructors, where 25 percent
of the “accepting” teachers did not evaluate
themselves and 42.8 percent of the “rejecting”
teachers did not evaluate themselves.


5. Age of Teachers 


An examination was made of the relationship 
between the age of the teachers and the ratings, 
and between age and scores on the SCT, Part III.
Table XXI indicates a significant difference be- 
tween the correlation coefficients of the evalua- 
tions of the teachers and the evaluations by the 
teachers, and the teachers’ scores on the SCT. 

It would seem that the significant and positive 
correlation of .245 between age and the teachers’
opinions of their own teaching effectiveness would 
show that the older the teachers are the better 
teachers they feel they are. This would appear 
to be unrealistic according to the negative and 
near significant correlation for the students. The 
administrators, however, saw no connection ap- 
parently between age and teaching effectiveness. 

The negative and significant correlation of 
-.256 between age and the SCT score shows that
the older teachers are more rejecting in their 
attitude. One would reason that the rejecting 
personality demonstrates some personality dis- 
organization and therefore he is apt to be unreal- 
istic; if this reasoning is valid, the students are 
probably more accurate in their evaluation of 
their teachers than the teachers are in their eval- 
uations of themselves. 
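
The significance of the age correlations can be illustrated with a brief sketch. It assumes N = 102 teachers, as given in the footnote to Table XXI, and uses the Fisher r-to-z approximation rather than whatever table of critical values the investigator consulted. The student correlation is entered as -.19 on the strength of the text, which describes it as negative and near significant.

```python
import math


def fisher_z_test(r, n):
    """Approximate test of a single correlation against zero.
    Returns (z statistic, two-tailed normal probability)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


n_teachers = 102  # assumed from the footnote to Table XXI

z_sct, p_sct = fisher_z_test(-0.256, n_teachers)         # age and SCT score
z_student, p_student = fisher_z_test(-0.19, n_teachers)  # age and student ratings

print(f"age vs. SCT:             z = {z_sct:.2f}, p = {p_sct:.3f}")
print(f"age vs. student ratings: z = {z_student:.2f}, p = {p_student:.3f}")
```

Under these assumptions the SCT correlation clears the .05 level comfortably, while the student correlation falls just short of it, matching the description "near significant."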

A further check on the reliability of the teach- 
ers’ evaluation of their teaching effectiveness 
was possible from an analysis of the rejection of 
the self evaluations. The older teachers tended 
to reject the self evaluations more than the young- 
er teachers did, but the difference of 13 percent 
was not significant. 

It can be concluded that (1) the older teachers
are more rejecting in their attitudes on the SCT
(-.256); (2) the older teachers felt that they were
more effective teachers (.245); (3) the students
felt that the older teachers were less effective;
(4) there was no significant difference in the percentage
of rejection of the self evaluations between
the older and younger teachers; and (5) the







TABLE XXI

AGE OF TEACHERS*

Pearson Product-Moment
Correlations                      Scale A

Age and
  Student Evaluations             -.19
  Administrator Evaluations        .07
  Self Evaluations                 .25**

Age and
  SCT: Part III                   -.26**

Mean Scores
  Age                             38.5 Years (Md = 35.8)
  SCT: Part III                   63.4

 * N = 102 teachers.
**Significant at .05 level.


TABLE XXII

YEARS OF TEACHING EXPERIENCE*

Pearson Product-Moment
Correlations                      Scale A

Years of teaching and
  Student Evaluations             .12
  Administrator Evaluations       .01
  Self Evaluations                .22**

Age and
  SCT: Part III                   .14

Mean Scores
  Years of teaching experience    13.6 (Md = 10.5)
  SCT: Part III

 * N = 102 teachers.
**Significant at .05 level.







rejecting person is more apt to be unrealistic.
Therefore, this investigator is inclined to dis- 
credit the significant correlation between the tea- 
chers’ concept of their own teaching effectiveness 
and their attitude of acceptance. 

A test was made with the correlation-ratio 
technique for a linear relationship between age 
and teaching effectiveness as rated by the stud- 
ents. It was found that a curvilinear relationship 
did exist, but it was not significant; both variables
were significant at less than the .50 level.
The student evaluation variable was only slightly 
more valuable than the age variable, .32 and .3 
respectively. 


6. Years of Teaching Experience 


It will be noted from Table XXII that the data 
on experience are approximately the same as on 
age, which could be expected. The same conclusions
are also applicable here. The only significant
factor to be noted with reference to experience
is the fact that the regression line was
more linear than it was for age. The experience 
variable was more valuable than the students’ 
evaluation of effectiveness variable, .24 and .174
respectively. 


SECTION IX
SUMMARY AND CONCLUSIONS 


THIS SECTION will present a summary 
of the investigation, conclusions, limitations of 
the findings, and implications for further 
research. 


I. Summary 


From the investigator’s several years of ex- 
perience as a public school teacher, as a super- 
visor of Air Force instructors and public school 
cadet teachers, as a personal counselor, as one 
given to introspection into his own behavior, as 
an interested observer of human behavior, and as 
a fortunate and grateful student of many leaders 
in the fields of education and psychology, the con- 
cept of acceptance has emerged in his thinking as 
a meaningful dimension of the optimum personal- 
ity organization. It was desired to discover in 
this investigation any significant relationship that
might exist between the subject’s measured atti- 
tude of acceptance and his effectiveness as a sec- 
ondary school teacher. It was further hoped that 
the effectiveness of the teacher as measured on 
Scale A could be better understood through an anal- 
ysis of the relationship between Scale A and the 
teacher’s acceptance of the student as measured 
by the raters on Scale B. The ease with which
the teacher went about his teaching as measured
on Scale C was also thought to be a contributing 
factor to teaching efficiency. 

The investigator obtained the cooperation of 
104 teachers in three secondary schools, and he 
obtained ratings on these 104 teachers and 56 ad- 
ditional non-participating teachers from their ad- 
ministrators and from an average of 62.39 stud- 
ents per teacher. The criterion evaluations were 
correlated against the 104 participating teachers’ 
measured attitude of acceptance on the SCT. 


II. Conclusions from Findings 





Some answers to the problems raised in Sec- 
tion I can be given from an interpretation of the 
findings. 


1. A relationship far beyond chance expectancy 
was found to exist between the teacher’s effective- 
ness in the classroom as evaluated by the stud- 
ents and that aspect of the teacher’s personality 
organization, or attitude, which permits him to 
be an accepting person (Tables XIII - XVI). Judg- 
ing only from the sample of participating teach- 
ers used in this investigation, it would be possi- 
ble to predict with a fair degree of accuracy from 
the SCT scores the teacher whom the students 
would feel to be the more effective teacher. It 
would not be possible to predict safely which tea- 
cher the administrator would judge to be effec- 
tive. 

If the students learn best from the effective 
teacher as defined in this investigation, the stud- 
ents will also learn best from the accepting tea- 
cher as defined and measured by the SCT. The 
effective teacher is also the teacher whom the 
students feel trusts them and has confidence in 
them, and who also seems to teach with ease and 
a sense of humor. (Inter-scale correlations 
ranged from .66 to .76.)

It was found that the SCT could correctly ident- 
ify the effective and ineffective teachers as rated 
by the students in better than 75 percent of the 
cases. 

2. It was found that the SCT was a reliable 
measure of the teacher’s attitude of acceptance. 
Two other scorers agreed with the investigator’s 
scoring of ten randomly selected tests. The inter-scorer
reliability for the two parts of the SCT
ranged from .85 to .95 (Table III). Five other
scorers and the investigator were able to differ- 
entiate between an acceptance and a rejection re- 
sponse in 391 sample responses out of 445, or 
87.9 percent. In only 12.1 percent of the respon- 
ses did one or more of the scorers disagree with 
the others as to whether the response was demon- 
strating acceptance or rejection (Table I). 

The self evaluations were found to be a signif- 
icant measure of the attitude of acceptance. Of 
the 33 teachers who declined to evaluate themselves,
a significant number were found to demonstrate
a rejection attitude on the SCT. The
mean score on the SCT for those who evaluated 
themselves was significantly higher than the mean 
score for those who did not evaluate themselves. 

3. It was found that the students’ evaluations 
of their teachers’ effectiveness were a reliable
measure for purposes of this investigation, with 
a reliability coefficient of .88. The critical ra- 
tios between the mean scores of the students at 
the different schools generally showed no signif- 
icant differences (Table VIII). 

4. The coefficients of reliability between the 
ratings of the students, administrators, and participating
teachers showed generally a positive
but an insignificant relationship (Table IV). The 
students and administrators demonstrated a clos- 
er agreement than did the teachers with either 
the students or the administrators. Apparently 
each class of raters was evaluating the teachers 
from a different point of reference. The critical 
ratios showing the reliability of the difference be- 
tween mean evaluation scores (Table VIII), the 
inter-rater reliability (Table IV), and the correl- 
ation coefficients with the SCT (Table XVI) all 
demonstrated the lack of strong agreement be- 
tween the different types of raters. 

The consistency of each class of raters was 
demonstrated in that the rank order of the schools 
for both students and administrators was main- 
tained on all three scales (Table IX). It was al- 
so observed that the students’ mean evaluation 
scores of their teachers were highest and the ad- 
ministrators’ lowest, with the teachers’ self 
evaluations in between (Table VII). 

A positive trend was noted for the students to 
evaluate the participating teachers higher than the 
non-participating teachers (Table VII). The ad- 
ministrators, on the other hand, consistently re- 
versed this trend. The same trend was noted for 
those teachers who declined to evaluate themselves
(Table XII).

The students were, according to paragraphs 
1 to 4 above, better able to evaluate their teach- 
ers’ effectiveness in terms of the acceptance at- 
titude than were the administrators. This factor 
may have been responsible for higher correlations
between the students’ evaluations of effectiveness
and acceptance as measured by the SCT
than between the administrators’ evaluations and 
acceptance. 

The tenth grade students showed a marked 
trend toward higher ratings and the eleventh grade 
to lower ratings (Table X). The male students 
rated the teachers the same as did the female 
students on Scales A and C; but the boys feel less 
accepted and trusted than the girls, as indicated 
by a significant critical ratio on Scale B (Table 
XI). 

None of the above conclusions on the differences
noted between raters can be interpreted to
mean that one class of raters is more able than
the others. The differences mean only that for 
purposes of this investigation the student evalua- 
tions yielded more meaningful results than the 
other evaluations. 

5. It was not possible to determine the relia- 
bility of the self evaluations, as 33 of the 104 tea- 
chers did not complete all three scales. The self 
evaluations provided meaningful data relative to 
the rationale of the predictive measure, but they 
were of no significant value as criterion meas- 
ures, except that there is reason to believe that 
teachers refusing to evaluate themselves are, on 
the whole, less effective teachers. 

6. The biographical data provided some mean- 
ingful relationships with the acceptance attitude 
and with some of the criterion measures. 


The attitude of acceptance seems to be ident-
ical for both male and female teachers (Table
XVII), and for those with less than two depend-
ents (Table XIX) and more than one dependent.

A positive and significant trend was noted for the
mean scores on the SCT to be higher for the mar-
ried teachers (Table XVIII), and for those who
teach academic subjects (Table XXII). The young-
er teachers (Table XXI) and those with less than
the mean number of years of teaching experience
(Table XXII) tended to score higher on the SCT,
as evidenced by the negative correlations between
age and acceptance.

An analysis of the mean scores on teacher ef- 
fectiveness as evaluated by the different raters 
showed some significant trends. The students’ 
and self evaluations showed no distinction between 
the male and female teachers, but the administra-
tors showed a preference for the male teachers
(Table XVII). No difference was apparent be-
tween the raters in the evaluation of the married
and single teachers, as all three groups rated
the married teachers higher (Table XVIII). The
students and administrators agreed that the tea- 
chers with more than one dependent were the bet- 
ter teachers, but the teachers themselves felt 
otherwise (Table XIX). The students and teach- 
ers rated the academic teachers higher than the 
non-academic teachers, while the administrators 
made no distinction (Table XX). 


It may be safely concluded that this investiga- 
tion is only one of hundreds of similar attempts 
to add some knowledge to our understanding of 
the effective teacher. The educational process 
cannot be improved through the efforts of educa- 
tors, psychologists, and philosophers working 
alone. This study may be an insignificant and in- 
tangible contribution for the common good, but 
it is hoped that this research effort may be as 
helpful to others working in the area of teaching 
effectiveness as the many studies examined have 
been helpful to this investigator. 








MEASURING KNOWLEDGE AND APPLICA- 
TION: AN EXPERIMENTAL 
INVESTIGATION 


DONALD E. SMITH 
MARVIN D. GLOCK 
Cornell University 


APPLICATION OF knowledge, either for 
the solution of an important problem or for the 
understanding which enriches life, is the goal of 
the student. To what extent classroom learning 
can be applied elsewhere, although investigated 
sporadically over the past thirty years, has rec- 
ently been brought into focus. Does possession 
of knowledge imply ability to use that knowledge? 
Empirical evidence suggests a negative answer. 
But the results of experimental investigations are 
equivocal (4, 6, 8, 9, 10, and 11). 

On the assumption that the answer is negative, 
that possession of facts and principles does not 
imply the ability to use them, and on the further 
assumption that current achievement tests do not 
adequately consider this outcome of learning, 
tests have been devised which purport to measure 
it. The authors have found, however, only two
tests with this objective in the literature: the
case study tests in human growth and development 
by Horrocks and Troyer (4,5), and a similar in- 
strument concerned with the developmental needs 
of adolescent girls by Sara Ann Brown (2). The 
case study tests seem to be most valuable for pur- 
poses of class discussion. Criticism has been 
directed at the method of scoring which consists 
of various weightings determined by the amount 
of agreement among experts on the correct re- 
sponse. Information on the second instrument, 
that concerning developmental needs, is inade- 
quate for an appraisal. 

It is suggested that, in addition to improving 
the scoring system of achievement tests of appli- 
cation, test items be constructed which require 
methods of problem solving used in actual situa- 
tions. 

The authors attempted to construct an objec- 
tive, short answer test which measured adequately 
the ability to apply facts and principles learned in 
an introductory college-level course in general 
psychology. They wished to obtain more data rel- 
ative to the relationship between knowledge and 
application. A final examination adequately cov- 
ering the material of a one term course in gen- 
eral psychology was constructed in two parts. 





Part I, consisting of eighty multiple-choice ques- 
tions based upon reading selections, was designed 
specifically to measure application of content.
Two of the selections were written by the authors;
three were adapted from materials which were
probably unfamiliar to the students. From ten
to thirty questions immediately follow each selec-
tion. Part II, consisting of seventy-nine questions,
is of traditional design and measures, primarily,
knowledge of facts and principles. There is little
overlapping of specific subject matter although all
course content areas are sampled by each part.
Care was exercised so that items were not over-
lapping and one did not contain information to aid
the testee in answering another.


Part I is composed of five reading selections, 
four of which are below the ninth grade level of 
reading difficulty, the fifth similar to the text- 
ual material read throughout the term. The av- 
erage length is about four hundred words. Three 
are largely conversational, picturing situations 
familiar to the students, e.g., studying for an 
examination in a dormitory room. Two discuss 
results of experimental investigations: one is 
written for laymen; the other is somewhat more
technical. 


Whereas Horrocks and Troyer tested diagnosis 
and remediation of hypothetical problems for a 
course in human growth and development, the 
authors view application of course content for 
elementary psychology in a somewhat different 
light. When one attempts to use his knowledge 
of behavior in a situation, he asks himself, ‘‘What 
is happening? Why is it happening? What prin- 
ciples are involved?’’ Questions illustrative of 
this pattern of application were constructed: 


What kind of reaction to her lack of 
study is indicated by Susan’s statement, 
‘‘I can’t understand a word in that book!’’


(1) logical; (2) repressive; (3) projective; 
(4) habitual 





JOURNAL OF EXPERIMENTAL EDUCATION 


(Differentiating ratio: .43)1


In reference to the following question, the room- 
mate has remarked that the examination would 
be a ‘‘lead pipe.’’ Susan mentally added ‘‘cinch’’
to complete the metaphor: 


What is illustrated by Susan’s addition 
of ‘‘cinch’’ to ‘‘lead pipe’’ resulting in 
the expression, ‘‘lead pipe cinch’’? 


(1) rational elaboration of ideas; (2) Mul- 
ler-Lyer principle; (3) principle of con- 
tinuity; (4) principle of closure 





(Differentiating ratio: .16)


Lashley’s experiment on retention of habits after 
losses of cerebral tissue is the subject of the fol- 
lowing. ‘‘Rats were trained first, then operated 
upon; then retention was determined by measur-
ing the amount of practice required to relearn
the maze perfectly.’’ (Excerpt from reading sel-
ection.)


How was retention of a habit determined? 
(1) recall; (2) delayed reaction; (3) sav- 
ing method; (4) error count 


(Differentiating ratio: .28) 


It will be apparent from the foregoing examples 
that the student is required to test a limited num- 
ber of solutions to the problem. In a real situa- 
tion, the number of possible solutions is usually 
much greater. But there is required here not 
only an understanding of course material, but al- 
so an ability to recast this material in the form 
of hypotheses for the solution of a variety of prob-
lems.

Preliminary validation of Part I was deter- 
mined by its coverage, the apparent validity of 
each item based upon its objective of measuring 
application, and the criticisms of the writers’ 
colleagues. The eighty items were chosen on this
basis from an original group of one hundred and
twenty-five. Part II, a test of knowledge, was
constructed in a like manner. The coefficient of 





(Vol. XXI 


reliability, based upon a population of one hundred 
thirty-three students, was determined by the split- 
half method. The Spearman-Brown formula ap- 
plied to the half-test coefficient yielded an esti-
mated reliability of .828.
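The Spearman-Brown step from a half-test coefficient to a full-test estimate may be sketched as follows. This is a minimal illustration in Python and not the authors' own computation; the function name and the half-test value of .70 are assumptions, since only the final estimate of .828 is reported.

```python
def spearman_brown(half_test_r: float, length_factor: float = 2.0) -> float:
    """Estimate full-test reliability from the correlation between two halves.

    length_factor = 2 is the ordinary split-half case, in which the full
    test is twice as long as either half.
    """
    denominator = 1.0 + (length_factor - 1.0) * half_test_r
    return (length_factor * half_test_r) / denominator


# Hypothetical half-test coefficient, used only to show the call.
estimated_reliability = spearman_brown(0.70)
print(round(estimated_reliability, 3))
```

The estimate always exceeds the half-test coefficient when that coefficient lies between 0 and 1, which is why the stepped-up figure rather than the raw half-test correlation is quoted.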


Results 


The total test was administered as a final ex- 
amination at one sitting to a class of one hundred 
and sixty-five students from the College of Agri-
culture and Home Economics at Cornell Univer-
sity. The two-hour limit was sufficient for all to
complete the examination. 

Validation of the test was approached indirect- 
ly in the absence of a suitable criterion. Product- 
moment correlations with various measures of 
aptitude and achievement are presented in Table 
I. The relationship between term average and 
Part I, r = .693, when compared with that for 
Part II is rather surprising in light of the objec- 
tive of the first part. We might have expected a 
somewhat lower coefficient if we are, indeed, 
measuring something different from knowledge. 
This will be clarified below. On the basis of co- 
efficients from .10 to .25 between the case study 
tests and the Ohio State, Horrocks (4) concludes 
that ‘‘....given a basically good intelligence, ad-
ded increments of intelligence in the superior
range do not add to ability to succeed on the test
in question.’’2 It may be questioned whether the
Ohio State measures ‘‘intelligence’’ per se. It
seems to measure reading and vocabulary. It
does seem significant, however, that Part I is
less closely related to this test of academic apti-
tude, r = .390, than is Part II, r = .520; and the
same may be said of the Cooperative Science Test
(.388 and .483). This is especially so since Part
I correlates significantly better than does Part II
with both chemistry (.280 and .012) and botany
(.498 and .388) grades. We may conclude from
these results that Part I measures something nec-
essary for success in chemistry and botany other
than rote learning, vocabulary or reading.3 The
CEEB V and M score coefficients are not signif-
icant. This may, perhaps, be attributed to the
greater complexity of the factors measured in 








1. The differentiating ratio is a simple device for determining the discriminating value of an 
item. It is the quotient of the number in the lowest 27% of the population who answer the 
item correctly divided by the number in the highest 27% who answer correctly. 
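The differentiating ratio described in footnote 1 may be illustrated with a short Python sketch. The function name, the rounding of each 27% group to the nearest whole student, and the sample data are assumptions made for illustration; the footnote itself gives only the definition.

```python
def differentiating_ratio(total_scores, item_correct, fraction=0.27):
    """Correct answers in the lowest group divided by those in the highest group.

    total_scores: total test score for each student.
    item_correct: parallel list of booleans, True where the student
                  answered the item correctly.
    Smaller values mean the item separates low and high scorers more sharply.
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("inputs must have the same length")
    n = len(total_scores)
    # Assumption: the 27% group size is rounded to the nearest whole student.
    group_size = max(1, round(fraction * n))
    order = sorted(range(n), key=lambda i: total_scores[i])
    low_group = order[:group_size]
    high_group = order[-group_size:]
    low_right = sum(1 for i in low_group if item_correct[i])
    high_right = sum(1 for i in high_group if item_correct[i])
    if high_right == 0:
        raise ValueError("no student in the highest group answered correctly")
    return low_right / high_right


# Hypothetical class of ten students, listed in order of total score.
scores = list(range(1, 11))
answers = [False, False, True, False, True, True, True, True, True, True]
ratio = differentiating_ratio(scores, answers)
```

Under this definition an item answered equally often by both groups has a ratio of 1.00, so the ratios of .43, .16 and .28 quoted for the sample items indicate that low scorers answered them correctly far less often than high scorers.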


2. Yet Troyer (5) states, concerning the self-same test, "....some students do very well in diag- 
nostic and remedial items from Part I, but as the data become more complex and complete, they 
do less well." This is reflected also in the present instrument.


3. Admittedly, this factor of ability to apply, as it is reflected here, is small. But, as will 
be seen later, size of a factor in a study of this type does not necessarily indicate its in- 


Observation of the effort and close concentration exerted by the students on Part I as con-
trasted with the comparative ease on Part II suggests that ability to generalize and integrate
principles plays an important part in application. 





June, 1953) SMITH - GLOCK 


TABLE I

RELATIONSHIPS WITH APTITUDE AND ACHIEVEMENT OF AN ACHIEVEMENT
TEST IN GENERAL PSYCHOLOGY

                                                  Part I    Part II

Term Average in Course
   (exclusive of final test)                       .693      .650*
Ohio State University Psychological Test           .390      .520*
Cooperative Science Test                           .388      .483*
Introductory Chemistry Grades                      .280      .012
Botany Grades                                      .498      .388*
College Entrance Examination Board - V Score       ...       .146
College Entrance Examination Board - M Score       ...       ...

*Significant at the .01 level. (The Part I coefficients shown are those quoted in the text;
cells marked ... and any significance markers for Part I are illegible in the source.)







this test. Ability in neither the verbal nor the 
mathematical areas is a guarantee of success in 
academic pursuits. 

For further validation of Part I, the following 
reasoning was used. Knowledge, as measured 
by Part II, does not assure ability to use that
knowledge as measured by Part I. We should ex-
pect, therefore, that those with the highest grades
on Part II might not receive the highest grades on 
Part I. Thus, the highest 20% on Part II scored, 
at the mean, 4.5 points lower on Part I than did 
the top 20% on Part I. The CR of the difference 
is 4.1 which is highly significant. 

A final method of validating Part I was an anal-
ysis of its correlation with Part II, r = .680 ± .04.
Correction for attenuation yields a coefficient of
.818 ± .03 which seems to indicate that Part I
and Part II are measuring, to a large extent,
though not entirely, the same factors. Horrocks
and Troyer (5) concluded, on the basis of inter-
correlations among the case study tests, that
‘‘application of knowledge to each case tends to
be highly unique.’’ Considering the somewhat
low reliabilities of the case study tests, their ob-
tained intercorrelations may not reflect the true
relationships. Corrected for attenuation, those
coefficients, .55, .39 and .62, become .72, .50
and .84. Considering the breadth of knowledge
required by differing emphases on the three tests,
these appear to be relatively high coefficients.
Howard (6), in a similar analysis of the relation-
ship between knowledge and ability to use that
knowledge, obtained a factor loading of .11 which
he assumed was the factor of item complexity, 
i.e., ability to apply knowledge. The remaining 
factors were identified as content. He concluded 
that, within the limits of his study (college level 
science), possession of knowledge is a sufficient 
guarantee of ability to use it. 
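The correction for attenuation used in the preceding paragraph may be sketched in Python as follows. The function name is hypothetical, and the two reliabilities in the usage line are assumed values chosen for illustration; the reliabilities actually entered by the authors are not both stated in this passage.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between true scores on two tests.

    r_xy: observed correlation between the two tests.
    r_xx, r_yy: reliability coefficients of the two tests.
    """
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Observed r = .680 as reported; both reliabilities are hypothetical.
corrected = correct_for_attenuation(0.680, 0.83, 0.83)
print(round(corrected, 3))
```

Because the divisor can never exceed 1, the corrected coefficient is never smaller than the observed one, and the less reliable the tests, the larger the upward adjustment.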

We are now in a position to examine further 
the conflicting evidence concerning the relation-
ship in question. In each of the studies cited,
Horrocks and Troyer, Howard, and the present
investigation, as well as those mentioned in the
introductory remarks, there appears a small fac-
tor which is variously labeled intelligence, high-
er mental processes, generalization and abstrac-
tion, inference, ability to apply principles, and
ability to think in a field. It is nearly always a
small factor which tends either to be overlooked 
or to be magnified. It is small, presumably, be- 
cause the largest single factor in the correlation 
studies must be knowledge of course content. 
This is the sine qua non of application. One must 







have the knowledge in order to use it. The fac- 
tor of application is difficult to isolate for this 
reason, but it is present in every study. That it
is, in reality, highly important for success in a
field we know empirically. 


Conclusions 


Despite conflicting evidence on the relation- 
ship between possession of knowledge and appli- 
cation of it, tests purporting to measure applica- 
tion have been constructed. Since we must con- 
sider application of class-learned knowledge a 
principal objective of education, there was de-
vised for a college course in general psychology
an achievement test which was meant to test, on 
different parts, application and knowledge. The 
parts consisted, respectively, of eighty and sev- 
enty-nine multiple-choice questions, the first 
eighty based upon reading selections. Both re- 
liability and validity appeared satisfactory. 

Part I, designed to measure application, cor- 
relates most highly with term averages, chem- 
istry grades and botany grades. Part II correl- 
ates most highly with general and scientific apti-
tude tests. It is concluded that Part I satisfactor-
ily measures something necessary for success
which is adequately measured neither by aptitude 
tests nor by traditional subject matter achieve- 
ment tests. How it is related to certain higher 
order mental processes has not been determined. 

Since possession of knowledge of facts and 
principles, per se, is necessary before that knowl- 
edge can be applied, the factor of application, de-
spite its practical importance, usually appears
small or negligible in correlation and factor an- 
alysis studies. It is suggested that this, in ad- 
dition to weaknesses in the studies themselves, 
is the reason for the conflicting results of inves- 
tigations in the relationship between knowledge 
and application. 

It should be noted, finally, that application 
may not be a spontaneous act. There is ample 
evidence that amount of transfer of previous learn- 
ing is influenced by amount and kind of training. 
Judd, furthermore, has stated that ‘‘....the most 
effective use of knowledge is assured not through 
the acquisition of any particular item of exper- 
ience but only through the establishment of asso- 
ciations which illuminate and expand an item of 
experience so that it has general value.’’4 The
teacher is responsible, then, not only for testing 
effective application but also for assisting stud- 
ents to develop that ability. 











4. C. H. Judd, Educational Psychology, quoted in James B. Stroud's Psychology in Education (New
York: Longmans, Green and Co., 1946), p. 592.







BIBLIOGRAPHY 


1. Atkins, Dorothy C. Construction and Anal-
ysis of Achievement Tests (Washington,
D.C.: U. S. Government Printing Office,
1947).

2. Brown, Sara Ann. ‘‘Technique for Evaluat-
ing the Ability of Teachers to Apply Prin-
ciples Concerned with the Developmental
Needs of Adolescent Girls,’’ Journal of Ed-
ucational Psychology, XLI (1950), pp. 481-
487.

3. Garrett, Henry E. Statistics in Psychology
and Education (New York: Longmans, Green
and Co., 1947).

4. Horrocks, John E. ‘‘The Relationship Be-
tween Knowledge of Human Development
and Ability to Use Such Knowledge,’’ Jour-
nal of Applied Psychology, XXX (1946), pp.
501-508.

5. Horrocks, John E., and Troyer, Maurice E.
‘‘Case Study Tests of Ability to Use Knowl-
edge of Human Growth and Development,’’
Educational and Psychological Measure-
ment, VII (1947), pp. 23-26.

6. Howard, Frederick T. Complexity of Mental
Processes in Science Testing, Contributions
to Education, No. 879 (New York: Teach-
ers College, Columbia University, 1943).

7. Stroud, James B. Psychology in Education
(New York: Longmans, Green and Co.,
1946), p. 592.

8. Tilton, J. W. The Relation Between Associ-
ation and the Higher Mental Processes,
Contributions to Education, No. 218 (New
York: Teachers College, Columbia Univer-
sity, 1926).

9. Tyler, R. W. Constructing Achievement
Tests (Columbus, Ohio: Bureau of Educa-
tional Research, Ohio State University,
1934).

10. Tyler, R. W. Education as Cultivation of
the Higher Mental Processes (New York:
The Macmillan Co., 1936), Ch. I.

11. Wood, Ben D. Measurement in Higher Edu-
cation (Yonkers, New York: World Book
Co., 1923).











AN INVERTED FACTOR ANALYSIS STUDY OF 
STUDENT-RATED INTRODUCTORY 
PSYCHOLOGY INSTRUCTORS 


A. W. BENDIG 
University of Pittsburgh 


THE PROBLEM of defining the important 
and relevant characteristics of teachers has a 
long and somewhat fruitless history. Many at- 
tempts have been made to isolate the essential 
differences between effective and ineffective tea- 
chers, typically for the future construction of 
selection instruments. Autobiographies of gen- 
erally recognized great teachers, descriptions 
written by students about their best-remembered 
teachers, quantitative test records of good and 
poor teachers, and a host of other devices have 
been used. This has been a particularly import- 
ant problem in the selection of elementary and 
secondary school teachers as Beecher’s recent 
review indicates (1). None of the attempts has 
proven very fruitful. Schools of education still 
depend today on the personal evaluation of pros- 
pective teachers by more or less experienced in- 
terviewers. 

Little attention has been devoted to the char- 
acteristics of college teachers in contrast to the 
mass of research on elementary and secondary 
school levels. A recent government report (7) 
suggests that the main objective in the training
of college teachers is the development of com- 
petent scholars and research workers and that 
little attention is paid to the development of teach- 
ing skills. 

One method of defining important teaching 
characteristics may prove to be the so-called 
inverted factor or Q-technique of factor analysis. 
A series of quantitative measurements on a group
of teachers may be intercorrelated and the re-
sulting matrix of correlations factor analyzed to
derive a set of independent factors that can ade-
quately describe the intercorrelations between
the teachers. Behavioral descriptions of these
factors may then help us in devising independent 
predictive measures of these important teaching 
variables. 

The present study was concerned with the ap- 
plication of inverted factor analysis techniques 
to a large group of student ratings of introductory 
psychology instructors. It has been previously 
shown that student ratings can reflect individual 
differences between instructors (3). Whether 
from these same ratings significant and independ- 





ent constellations of teaching behaviors can be 
determined was investigated in this research. 


Procedure 


Ten introductory psychology instructors were 
rated by their undergraduate students at the end 
of the fall semester, 1950-51. Each student rat- 
ed his instructor on the fourteen five-choice rat- 
ing scales developed and described by Crannell (4). 
The scales cover many different facets of instruc- 
tor personality, such as organization of course 
material, friendliness toward the students, per- 
sonal appearance, etc. A total of 490 students 
participated in the ratings with individual instruc-
tors being rated by from 12 to 90 students. Fur-
ther information on the instructors, students, and 
scales can be found in a previous report (3). This 
previous paper also contains the obtained means 
and standard deviations of each of the fourteen 
scales for 490 daytime students in addition to 
similar information for evening sections of intro- 
ductory psychology. 

The mean rating of each of the ten instructors 
on each of the fourteen scales was computed and
the raw score deviation of the mean from the 
mean of the scale determined. This deviation 
was given a plus sign if the deviation was toward 
the low (favorable) end of the scale and a negative 
sign if toward the high (unfavorable) end. Since 
individual scales differed in their variability, the 
fourteen deviation scores for each instructor were 
divided by the standard deviations of the scales. 
The basic data on each instructor was a profile 
of fourteen standard scores indicating his posi- 
tion on each scale as being above or below the 
mean of the group and the amount of his deviation 
in standard score units. 
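The construction of an instructor's profile described above may be sketched in Python. The function and argument names are hypothetical and the numbers in the usage lines are invented; the sketch assumes, as the text states, that low scale values are favorable, so that a mean below the scale mean receives a plus sign.

```python
def instructor_profile(instructor_means, scale_means, scale_sds):
    """Convert one instructor's mean ratings into signed standard scores.

    instructor_means: the instructor's mean rating on each scale.
    scale_means: mean of each scale over the whole group.
    scale_sds: standard deviation of each scale.
    A positive score marks a deviation toward the low (favorable) end.
    """
    if not (len(instructor_means) == len(scale_means) == len(scale_sds)):
        raise ValueError("all inputs must cover the same scales")
    profile = []
    for own_mean, group_mean, sd in zip(instructor_means, scale_means, scale_sds):
        if sd <= 0:
            raise ValueError("scale standard deviation must be positive")
        # Low ratings are favorable, so the sign of the deviation is reversed.
        profile.append((group_mean - own_mean) / sd)
    return profile


# Two hypothetical scales instead of the fourteen used in the study.
example_profile = instructor_profile([2.0, 3.5], [2.5, 3.0], [0.5, 1.0])
```

Dividing by each scale's standard deviation puts all fourteen scales on a common footing, so that no single highly variable scale dominates the profile correlations computed in the next step.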

The profile scores of each instructor were 
then correlated (product-moment) with the pro- 
files of each of the other instructors and the cor- 
relational matrix shown in Table I determined. 
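The intercorrelation of instructor profiles, which is what makes the analysis "inverted", may be sketched in Python as follows. The helper names and the three short profiles are assumptions for illustration; in the study each profile held fourteen standard scores and there were ten instructors.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return sxy / math.sqrt(sxx * syy)


def profile_correlations(profiles):
    """Correlate every pair of instructors across the rating scales.

    profiles: dict mapping an instructor label to a list of standard scores.
    Returns a dict keyed by (label_a, label_b) with label_a < label_b.
    """
    labels = sorted(profiles)
    return {
        (a, b): pearson_r(profiles[a], profiles[b])
        for a in labels
        for b in labels
        if a < b
    }


# Hypothetical four-scale profiles for three instructors.
demo = {
    "A": [1.0, 0.0, -1.0, 0.5],
    "B": [2.0, 0.0, -2.0, 1.0],
    "C": [-1.0, 0.0, 1.0, -0.5],
}
matrix = profile_correlations(demo)
```

Here the persons are the variables and the scales are the observations, the reverse of the usual arrangement in which scales are correlated across persons.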

This matrix of intercorrelations between in- 
structors was analyzed by standard Thurstone 
centroid techniques with the slight variation that 
in reflecting signs in the original and residual 
matrices the criterion used was to maximize the 







algebraic sums of the columns of correlations 
rather than to minimize the number of negative 
signs. This variation has been used by Michael, 
Zimmerman, and Guilford (9). On the first an- 
alysis of the matrix the highest correlation in 
each column was used as the communality esti- 
mate in the diagonal of the table. This practice 
was also followed for each of the residual ma- 
trices as suggested by Thomson (10). Analysis 
was discontinued after the extraction of the third 
factor upon the application of Tucker’s criterion 
(11). New communality estimates were made on 
the basis of the three extracted factors, these 
new estimates inserted in the original matrix, 
and the matrix again analyzed. The process was 
repeated a third time and the median absolute 
difference between communality estimates based 
on the second and third analyses found to be .025.
On the basis of this small variation iteration of 
the procedure was stopped. 
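The extraction of a first centroid factor may be sketched in Python as a rough illustration of the kind of computation involved. The sketch is deliberately incomplete: it omits sign reflection, the extraction of later factors, and the re-estimation of communalities, and it takes the communality estimate as the highest off-diagonal correlation in absolute value, which is an assumption. The function name and the small matrix are hypothetical.

```python
import math


def first_centroid_factor(correlations):
    """Extract one centroid factor from a symmetric correlation matrix.

    The diagonal is replaced by a communality estimate (assumed here to be
    the largest absolute off-diagonal value in each column). Loadings are
    the column sums divided by the square root of the grand total.
    Returns (loadings, residual_matrix).
    """
    n = len(correlations)
    work = [list(row) for row in correlations]
    for j in range(n):
        work[j][j] = max(abs(correlations[i][j]) for i in range(n) if i != j)
    column_sums = [sum(work[i][j] for i in range(n)) for j in range(n)]
    total = sum(column_sums)
    if total <= 0:
        raise ValueError("sign reflection would be needed before extraction")
    root = math.sqrt(total)
    loadings = [s / root for s in column_sums]
    residual = [
        [work[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Hypothetical 3 x 3 matrix; the study's matrix was 10 x 10.
r = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings, residual = first_centroid_factor(r)
```

Every column of the residual matrix sums to zero after extraction, which is why some columns must be reflected in sign before a further factor can be taken out, and why a rule for choosing the reflections, such as the one described above, is needed.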

Rotating the factor axes to a
‘‘psychologically meaningful’’ position presented
problems. Pairs of the three extracted factors
were graphed as the distribution of the ten instruc- 
tors in two-dimensional factor space. The cri- 
terion adopted was to rotate the axes so as to 
maximize the separation of the instructors into 
two or three groups on each factor. The first 
factor extracted adequately met this criterion, 
the second and third were rotated to meet this 
requirement. The original factor loadings for 
all three factors and the rotated values for Fac-
tors II and III can be found in Table II.

The problem of naming or describing the ob- 
tained factors in a Q-technique study presents 
some difficulties for which several solutions have 
been used. Guilford and Holley (5) computed the 
product of the factor loading of each individual by 
the rating given by that individual to each of the 
test objects. The sums of the products for each 
object determined the rank order of the objects 
for a given factor and from this rank order the 
authors derive a verbal description of the factor. 
Bendig (2) used the factor loadings of the individ- 
uals on each factor to graph the individuals on a 
linear scale and asked judges who knew the sub- 
jects very well to write a description of a person- 
ality characteristic that would result in the ob-
tained ordering of the subjects. Factor descrip-
tions were abstracted on the basis of common
phrases in the written descriptions. The method
used in the present study is similar to that used
by Holley and Buxton (6). In their study factor 
loadings were correlated by biserial correlation 
with the responses of the subjects to each true- 
false item on a test of beliefs. Items correlating 
highest with the factor loadings were used to de- 
scribe the factors. 

In the present study the factor loadings of the 
instructors on each factor were separately cor- 
related (product-moment) with their standard 





scores on each of the fourteen rating scales. A 
positive correlation indicated that instructors 
with high positive loadings on a particular factor 
tended to get favorable ratings on the single scale 
involved and negative correlations the reverse. 
The three rating scales showing the largest abso- 
lute correlations (regardless of the algebraic 
sign) were used to derive a description of each 
factor. The wording of the factor descriptions 
was taken from the wording of the correlated 
scales and no abstraction of verbal content was 
attempted. The following are the descriptions of 
the characteristics of instructors on the positive 
and on the negative extremes of each factor. The 
scales and their correlations with the factor load- 
ings are given after each factor heading. 
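The selection of the three describing scales for a factor may be sketched in Python. The function names and the data are hypothetical; the sketch assumes only what the text states, namely that loadings are correlated with standard scores scale by scale and that the scales are ranked by absolute size of correlation.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def describing_scales(factor_loadings, scale_scores, top_n=3):
    """Pick the scales whose scores correlate most strongly with a factor.

    factor_loadings: one loading per instructor on the factor.
    scale_scores: dict mapping scale number to the instructors' standard
                  scores on that scale, in the same instructor order.
    Returns (scale, correlation) pairs, largest absolute correlation first.
    """
    correlations = {
        scale: pearson_r(factor_loadings, scores)
        for scale, scores in scale_scores.items()
    }
    ranked = sorted(correlations.items(), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:top_n]


# Hypothetical loadings for four instructors and scores on four scales.
demo_loadings = [0.1, 0.4, 0.7, 0.9]
demo_scores = {
    1: [0.1, 0.4, 0.7, 0.9],
    2: [-0.1, -0.4, -0.7, -0.9],
    3: [1.0, -1.0, 1.0, -1.0],
    4: [0.0, 0.5, 0.4, 1.0],
}
chosen = describing_scales(demo_loadings, demo_scores)
```

Ranking by absolute value keeps negatively correlated scales in the description; the sign then decides which wording of the scale belongs to the positive extreme of the factor and which to the negative.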


Factor I. Scales 2, 6, 10 (-.69, -.66, -.79) 

Content of his classroom presentation is some- 
times dull and uninteresting. Usually keeps fair 
control of the class, but sometimes lets students 
sidetrack him. Usually shows some sense of hum- 
or in class. 

Content of his classroom presentation is fre- 
quently quite interesting and seldom is dull. Al- 
ways keeps things moving smoothly and seldom 
loses control of the class. Has an exceptionally 
good sense of humor. 


Factor II. Scales 1, 4, 7 (.37, -.36, -.38) 

Course material is very well organized. Has 
a fairly friendly attitude toward students, but 
sometimes is variable in his attitude. Usually 
is reasonable as to length of assignments, but 
sometimes is unreasonable. 

Part of the course material is organized, but 
most of it is loosely organized and becomes in- 
definite and confusing. Has an exceptionally 
friendly attitude toward students. Always very 
fair and reasonable toward the length of assign- 
ments. 


Factor III. Scales 5, 8, 12 (-.44, .45, -.57)

Occasionally recognizes student effort, but 
sometimes appears indifferent. His examina- 
tions are usually quite fair and reasonable. Shows 
few annoying mannerisms in class, but is not un- 
usually attractive in appearance. 

Exceptionally appreciative attitude toward stud- 
ent effort and encourages it. His examinations 
are sometimes unfair. Has an unusually attrac- 
tive appearance. 


To validate the above factor descriptions four 
members of the departmental faculty who know 
the instructors quite well were given the pairs 
of factor descriptions and asked to rank order 
the ten instructors along each scale from the in- 
structor best described by one of the pair of par- 
agraphs to the instructor best described by the 
other paragraph. Their rankings of the instruc- 







TABLE I 


CORRELATIONS (PRODUCT-MOMENT) BETWEEN STANDARD SCORE PROFILES OF TEN 
INSTRUCTORS ON STUDENT RATING SCALES 





(Matrix of intercorrelations among the ten instructors; the values are not recoverable from the source.)





TABLE II


FACTOR LOADINGS OF TEN INSTRUCTORS DETERMINED FROM CORREL- 
ATIONS BETWEEN STUDENT RATING SCALE PROFILES 








Original Rotated 
Instructor 0 0 


A F . 64 ‘ ‘ . 30 





17 , ; .18 
15 . ° . 23 
. 43 ‘ ‘ -41 
. 07 : , -15 
25 , ‘ . 00 
. 36 ‘ ‘ . 33 
-19 ‘ . . 44 
. 38 ° ° . 48 
22 ‘ ‘ . 26 





Percent of 
total 
Variance 



















tors on each factor were correlated (rank-differ- 
ence rho) with the rankings of the instructors on 
each factor as determined by their loadings. The 
median validity coefficient for each scale was com- 
puted and the inter-judge agreement among the 
four judges was determined by Kendall’s ‘‘coef- 
ficient of concordance’’ (8, p. 80ff). The med- 
ian validity coefficient and the coefficient of con- 
cordance for each factor was: Factor I, .50 and 
.71; Factor II, .49 and .56; and Factor III, .41
and .53. 
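The two statistics used in this validation, the rank-difference correlation and Kendall's coefficient of concordance, may be sketched in Python. Both functions assume complete rankings without ties, and the names and sample rankings are hypothetical.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Spearman rank-difference correlation between two untied rankings."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same objects")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - (6.0 * sum_d2) / (n * (n * n - 1))


def coefficient_of_concordance(rankings):
    """Kendall's W for several judges ranking the same objects (no ties).

    rankings: list of rankings, one per judge, each giving the rank
              assigned to object 0, object 1, and so on.
    """
    m = len(rankings)
    n = len(rankings[0])
    if any(len(judge) != n for judge in rankings):
        raise ValueError("every judge must rank the same number of objects")
    rank_totals = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2.0
    s = sum((total - mean_total) ** 2 for total in rank_totals)
    return (12.0 * s) / (m * m * (n ** 3 - n))


# Four hypothetical judges in complete agreement on five instructors.
agreeing_judges = [[1, 2, 3, 4, 5]] * 4
w = coefficient_of_concordance(agreeing_judges)
rho = rank_difference_rho([1, 2, 3], [3, 2, 1])
```

In the study each judge's ranking would be compared by rho with the ranking implied by the factor loadings, the median of the four coefficients taken as the validity figure, and W computed over the four judges' rankings of the ten instructors.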


Discussion 


The pairs of factor descriptions of introduc- 
tory psychology instructors reported above would 
seem to be the important point of this study. The 
pictures of important (to the student) constella- 
tions of behavioral characteristics of instructors 
are not those of the conventional ‘‘good teacher,’’ 
but they sound suspiciously like descriptions of 
human beings. Seemingly mutually contradictory 
characteristics are combined in the descriptions, 
yet an image of teachers we all have known be- 
gins to appear from the synthesis of these traits 
within a single factor description. Nor do the de-
scriptions lack validity when objectively compared
to the evaluations of judges who are well acquaint- 
ed with the personal characteristics of the in- 
structors involved. No attempt can be made to 







evaluate which end of the factor continua are 
‘‘good’’ or ‘‘bad’’ traits of a teacher; that depends
upon the rational judgment of a judge or upon the 
empirical relationship between these continua and 
some outside criterion of teaching effectiveness, 
such as the amount of course content that is ab- 
sorbed by the students. The reliability or stabil- 
ity of the factor from course to course, from 
discipline to discipline, or from school to school 
is, of course, unknown. Nor is it known whether 
similar factor descriptions would be found if a 
different initial set of rating scales were used. 


Summary 


The student rating scale profiles of ten intro- 
ductory psychology instructors were correlated 
and the matrix of intercorrelations was factor 
analyzed by inverted factor techniques. Three 
factors were extracted, two of which were rotat- 
ed to maximize clustering of instructors into 
groups. The factor loadings of the instructors 
were correlated with their scores on each of the 
fourteen rating scales, and the three scales correlating
highest with each factor were used to describe
the extremes of each linear factor. Validity of
the factor descriptions was determined by correlating
the factor loadings of the instructors with
the rankings of the instructors on the three factors
by four competent judges. The median validity
was .49.
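
The first step of the inverted technique summarized above, correlating instructors with one another across the rating scales rather than correlating scales across instructors, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the profile values are hypothetical, and the factor extraction and rotation steps are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def profile_correlations(profiles):
    """Correlate every pair of persons across their scale scores.

    profiles: dict mapping a person to that person's list of scale scores.
    In an inverted analysis the persons, not the scales, are the variables.
    """
    names = list(profiles)
    return {(p, q): pearson(profiles[p], profiles[q])
            for p in names for q in names}


# Hypothetical mean ratings of three instructors on four scales.
profiles = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1],
}
matrix = profile_correlations(profiles)
```

Instructors whose profiles rise and fall on the same scales correlate highly and would be expected to load on the same factor; it is this person-by-person matrix that is then factored.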


BIBLIOGRAPHY 


1. Beecher, D. E. The Evaluation of Teaching:
   Backgrounds and Concepts (Syracuse, N. Y.:
   Syracuse University Press, 1949).

2. Bendig, A. W. ‘‘A Q-Technique Study of the
   Professional Interests of Psychologists.’’
   (To be published.)

3. Bendig, A. W. ‘‘The Use of Student Rating
   Scales in the Evaluation of Instructors in
   Introductory Psychology,’’ Journal of Educational
   Psychology. (In press.)

4. Crannell, C. W. ‘‘An Experiment in the Rating
   of Instructors by Their Students,’’ College
   and University, XXIII (1948), pp. 5-11.

5. Guilford, J. P. and Holley, J. W. ‘‘A Factorial
   Approach to the Analysis of Variances
   in Esthetic Judgments,’’ Journal of Experimental
   Psychology, XXXIX (1949), pp. 208-218.

6. Holley, J. W. and Buxton, C. E. ‘‘A Factorial
   Study of Beliefs,’’ Educational and Psychological
   Measurement, X (1950), pp. 400-410.

7. Kelley, F. J. Toward Better College Teaching
   (Washington, D.C.: U. S. Government
   Printing Office, 1950).

8. Kendall, M. G. Rank Correlation Methods
   (London: Griffin, 1948).

9. Michael, W. B., Zimmerman, W. S. and
   Guilford, J. P. ‘‘An Investigation of Two
   Hypotheses Regarding the Nature of the
   Spatial-Relations and Visualization Factors,’’
   Educational and Psychological Measurement,
   X (1950), pp. 187-213.

10. Thomson, G. The Factorial Analysis of Human
    Ability, 3rd ed. (New York: Houghton-Mifflin,
    1948).

11. Wright, R. E. ‘‘A Factor Analysis of the Original
    Stanford-Binet Scale,’’ Psychometrika,
    IV (1939), pp. 209-220.











JUDGMENTS BY 820 COLLEGE EXECUTIVES 
OF TRAITS DESIRABLE IN LOWER- 
DIVISION COLLEGE TEACHERS 


M. R. TRABUE 
Pennsylvania State College 
State College, Pennsylvania 


THE AMERICAN Association of Colleges 
for Teacher Education appointed in February, 
1948, a sub-committee 1 of its standing Commit- 
tee on Studies and Standards to make a study of 
the ‘‘preparation of college teachers’’. This sub- 
committee, after a number of meetings, decided 
that its chief effort would be to try to clarify the 
directions in which changes should be made in 
the current preparation of college teachers, and 
that the best available source of data on current 
qualifications would be the judgments of college 
executives who regularly employ newly-prepared 
college teachers. Since most young teachers be- 
gin their careers on the faculty by teaching intro- 
ductory courses to first and second year students, 
it seemed appropriate to focus the inquiry on the 
desirable traits possessed by these lower-divis- 
ion college teachers. 

A careful review of previous studies and pub- 
lished reports provided a long list of traits and 
behavior patterns that students, fellow faculty 
members, alumni, and others had considered 
important in college teachers. By eliminating 
items that had little real evidence behind them, 
combining and rephrasing items that were suffic- 
iently similar, and grouping together the items 
that seemed to refer to related types of activities, 
a list of fifty-two traits was obtained and printed 
in Inquiry A. 

College executives were asked to read each
statement carefully and to indicate, by making a
check mark in one of three printed columns, the
amount of weight usually given to that trait when
employing a new instructor or assistant professor.
They were also asked to make a check in a
fourth column opposite any desirable trait ‘‘of
which you have rarely found evidences on the credentials
of applicants for teaching positions in
your institution.’’

A number of special reports have been published,2
but in each of them the data reported have
been limited to those particular traits which were rated
‘‘highly important’’ by 50 percent or more of
the special group of executives to which
the report was addressed. In order to provide
a complete record, so that other investigators
may be able to use the data intelligently, it seems
desirable to publish now the original form (Inquiry
A) used, including not only the printed instructions
and the fifty-two traits, but also the
frequency with which the 820 college executives
checked each of the four columns opposite each
trait. In this tabulation, for example, only 10 of
the 820 executives reported that they considered
item Ia (‘‘General academic record is high’’) as
having ‘‘Little Value’’ (undesirable or not very
important), while 465 of them considered it as
having ‘‘Real Value’’ (important), 345 checked
it as having ‘‘Great Value’’ (highly important),
and only 12 of them reported that this item was
‘‘Rarely Noted’’ in the credentials of applicants
for teaching positions in their institutions.
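
The conversion of these frequencies into percentages is simple arithmetic and can be sketched as below. The counts for items Ia and Ib are those printed in the tabulation; the function name and the `meets_criterion` helper are hypothetical, and the 50 percent criterion is applied here to the combined sample of 820 only by way of illustration, whereas the special reports applied it to particular groups of executives.

```python
TOTAL_EXECUTIVES = 820


def rating_percentages(little, real, great):
    """Percentage of respondents checking each of the three value columns."""
    counts = {"Little Value": little, "Real Value": real, "Great Value": great}
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def meets_criterion(percentages, cutoff=50.0):
    """True when the 'Great Value' (highly important) share reaches the cutoff."""
    return percentages["Great Value"] >= cutoff


# Item Ia: General academic record is high.
item_ia = rating_percentages(10, 465, 345)
# Item Ib: Academic record in his special field is unusually high.
item_ib = rating_percentages(26, 380, 414)
```

On these counts about 42 percent of the combined group rated item Ia ‘‘highly important’’, while item Ib just passes the 50 percent mark.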

The most important findings that do not appear 
in the data reported here are differences among 





1. The other members of the Committee are Dr. Ruth E. Eckert, University of Minnesota; Dr. Karl
Bigelow, Teachers College, Columbia University; Dean L. D. Haskey, University of Texas; President
John R. Emens, Ball State Teachers College; and President S. M. Brownell, New Haven State
Teachers College.

2. "Characteristics of Lower Division College Teachers Preferred by Executives of Teacher Education
Institutions," Third Yearbook, American Association of Colleges for Teacher Education,
Oneonta, N. Y., 1950.

"Characteristics of College Instructors Desired by Liberal Arts College Presidents," Bulletin,
Association of American Colleges, XXXVI, No. 3 (October 1950), pp. 374-379.

"What Traits Should Junior College Teachers Possess?" Junior College Journal, XXI, No. 3 (November
1950), pp. 10-12.










Responses by 820 College Executives 





PREPARATION OF COLLEGE TEACHERS 


Inquiry A 


This inquiry is to be checked by the chief employing officer (President or Dean) of the college. 


INSTRUCTIONS 


In checking the traits and experiences listed below as ‘‘Qualifications of College Teachers,’’
please indicate your judgment of the practical importance of each as it affects the success of a
teacher on your own faculty. Other things being equal, how much weight do you usually give the
item when evidences of it appear in the ‘‘credentials’’ of an applicant for a position as Instructor
or Assistant Professor to teach lower-division classes?

1. Make a check mark (✓) in the first column (Little Value) if you consider the item either as
‘‘undesirable’’ or as ‘‘not very important.’’
2. Check the item in the second column (Real Value) if you consider it ‘‘important.’’
3. Check the item in the third column (Great Value) if you consider it ‘‘highly important.’’

In the fourth column (Rarely Noted), make a second check mark opposite any item you have already
checked as having ‘‘Real Value’’ or ‘‘Great Value’’, but of which you have rarely found evidences
in the credentials of applicants for teaching positions in your institution.


Added spaces are provided at the end of each section in which to list and to rate additional qual- 
ifications for which you always look when employing teachers for lower-division college classes. 


Relative Importance

Qualifications of College Teachers        1          2          3
                                        Little      Real      Great
                                        Value      Value      Value


I. As a Scholar
a. General academic record is high    10  465  345
b. Academic record in his special field is unusually
   high    26  380  414
c. Has done important research in his field    460  109
d. Has published scholarly articles or books    461  86
e. Contributes to meetings of professional and
   scholarly societies    577  125
f. Has earned doctor’s degree    454  247
g. Holds a graduate degree from a ‘‘noted university’’    405  98
h. Graduate major was in a special area of an academic
   subject (e.g., Modern European History;
   Colloidal Chemistry; etc.)    66
i. Graduate major covered all important divisions
   of his academic subject (e.g., History; Physics;
   Psychology; etc.)










j. Graduate study included all divisions of his sub- 
ject plus extensive work in another broad field 88 391 341 
k. ________


II. As a Teacher
a. Understands the problems most often met by college
   students in their work
b. Has studied problems of college teaching and of
   its evaluation
c. Has successfully taught his subject in college
d. Has studied the objectives of ‘‘general education’’
   for college students
e. Has successfully taught college courses for their
   ‘‘general education’’ values
f. Has been successful as elementary or secondary
   school teacher
g. Organizes materials and prepares carefully for
   each meeting with class
h. Inspires students to think for themselves and to
   express their own ideas sincerely
i. Leads students to take responsibility for planning
   and checking their own progress
j. Has demonstrated skill in methods of instruction
   appropriate to his field
k. Has infectious enthusiasm for teaching that inspires
   students to want to teach


III. As a Student Counselor
a. Is friendly, democratic, tolerant, and helpful in
   his relations with students
b. Assists students to collect, analyze, and evaluate
   data on their own vital problems
c. His students voluntarily seek his advice on intimate
   personal problems
d. Has studied the techniques of diagnosis and guidance
   of college students
e. Has demonstrated unusual competence as a
   counselor of college students
f. Has been successful as leader of young people
   in scouting, club work, camping, etc.





IV. As a College Faculty Member
a. Has studied the special interests, abilities, and
   needs of college students
b. Has studied the purposes, curricula, organization,
   and procedures of higher education
c. Has participated constructively in departmental
   and general faculty meetings
d. Contributes effectively in committee planning and
   work
e. Takes broad (rather than departmental) view of
   educational problems
f. Understands the contributions of college instruction
   in other fields than his own    15  459  346
g. Regards himself as primarily a college teacher
   (rather than as a subject-matter specialist)    31  251  538
h. Shows active interest in continued professional    10


V. As a Person
a. Has good health and physical vigor
b. Is emotionally stable and mature
c. Has genial personality and sense of humor
d. Is always neat and well groomed
e. Has a wholesome family life
f. Is less than 35 years of age
g. His behavior reflects high ideals


VI. As a Citizen
a. Is at ease in social situations
b. Is well informed on current events
c. Participates in cultural activities (art, music,
   literature, etc.)
d. Takes a part in religious activities
e. Is active in civic and welfare groups
f. Engages actively in political work
g. Holds fair-minded attitudes on controversial
   issues
h. Is an effective public speaker
i. Was successful in a non-academic job
j. Has gained a cosmopolitan outlook through
   travel and wide reading





Please check the type of institution for which your ratings (above) are probably most valid.
Teachers college    School of Education    Community or junior college    Liberal Arts College


Any comments regarding your chief difficulties in finding qualified college instructors would be 
helpful. Please indicate below the sources to which you most often turn in seeking applicants for 
teaching positions. 


Mail when completed to M. R. Trabue, 102 Burrows Building, State College, Pa. 










the executives of different types of colleges and 
differences among those whose institutions are 
located in different sections of the United States. 
The size of these differences on most items was
surprisingly small.3 In the table published on
page 136 of the June, 1951, issue of the Journal
of Teacher Education, for example, the mean
difference between the extreme percentages of
‘‘highly important’’ ratings on the same items
assigned by executives of junior colleges, colleges
for teacher education, and liberal arts colleges
was 12.3 percent. The mean difference between
the extreme percentages by executives of
colleges in the four different sections of the
country was only 7.3 percent. While these differences
reflect somewhat different philosophies
of higher education among college executives in
different parts of the country and in different
types of institutions, the important facts are the
differences among the traits themselves, as reported
here in the combined ratings of 419 liberal
arts college presidents, 204 junior college presidents,
and 197 executives of colleges for teacher
education.


3. The writer will be glad to supply the basic data on these differences to any research worker
who has real need for them.













